Crackme 17: Silva97’s login-cipher
This crackme from 2019 is hosted on crackmes.one. You can also download the binary here.
Let’s start with the basics: the program mentions that no patching is allowed, and then asks for a single password.
$ ./login
Don't patch it!
Insert your password: hello
Wrong!
strings
tells us this was compiled with GCC on Ubuntu, and a few weird values stand out:
Gtu.}'uj{fq!p{$
Lszl{{%.
vx{!whvt|twg?%
%64[^
fhz4yhx|~g=5
Ftyynjy*
Zwvup(
The output of strings
often includes some irrelevant garbage, but these look different.
strace
doesn’t reveal any ptrace
calls, so at least we won’t have to deal with this kind of protection. The ELF header also seems untouched, which means that GDB can open and run the binary without issues.
Let’s start with Hopper and find these strings. We quickly notice that several are referenced in sub_12a1
, each one loaded into rdi
before a call to sub_1348
, which seems like it might be a decryption function:
00000000000012bd lea rdi, qword [aGtuujfqp] ; "Gtu.}'uj{fq!p{$"
00000000000012c4 call sub_1348 ; sub_1348
00000000000012c9 mov esi, 0x0
00000000000012ce lea rdi, qword [aGtuujfqp+16] ; "Lszl{{%"
00000000000012d5 call sub_1348 ; sub_1348
00000000000012da lea rax, qword [rbp+var_50]
00000000000012de mov rsi, rax
00000000000012e1 lea rdi, qword [a64n] ; "__format" for scanf: "%64[^\\n]"
00000000000012e8 mov eax, 0x0
00000000000012ed call j___isoc99_scanf ; __isoc99_scanf
00000000000012f2 lea rax, qword [rbp+var_50]
00000000000012f6 lea rsi, qword [aFhz4yhxg5] ; "fhz4yhx|~g=5"
00000000000012fd mov rdi, rax
0000000000001300 call sub_13e3 ; not sure what this one does yet
0000000000001305 test eax, eax
0000000000001307 jne loc_131c
0000000000001309 mov esi, 0x1
000000000000130e lea rdi, qword [aFtyynjy] ; "Ftyynjy*"
0000000000001315 call sub_1348 ; sub_1348
000000000000131a jmp loc_132d
loc_131c:
000000000000131c mov esi, 0x1 ; CODE XREF=sub_12a1+102
0000000000001321 lea rdi, qword [aZwvup] ; "Zwvup("
0000000000001328 call sub_1348 ; sub_1348
The next logical step would be to look at sub_1348
. The function is pretty short: it simply copies its argument into a buffer using strcpy
(this tells us that the input strings are null-terminated), and then calls sub_1218
with this buffer as its single argument. After sub_1218
returns, its output is passed to j_puts
, either with the string as a single parameter or with the string and a reference to stdout
– think puts(const char *)
vs fputs(const char *, FILE *)
.
We can also see it save its return address and check it after the call to j_puts
, which I think is a stack overflow canary added by the compiler and not logic that was part of the program’s source code. This was likely added when the compiler detected a strcpy
writing into a fixed-sized buffer.
sub_1348:
0000000000001348 push rbp
0000000000001349 mov rbp, rsp
000000000000134c sub rsp, 0x120
0000000000001353 mov qword [rbp+var_118], rdi ; copy argument into temp var
000000000000135a mov eax, esi
000000000000135c mov byte [rbp+var_11C], al
0000000000001362 mov rax, qword [fs:0x28] ; stack overflow canary
000000000000136b mov qword [rbp+var_8], rax ; stored into a local variable
000000000000136f xor eax, eax
0000000000001371 mov rdx, qword [rbp+var_118] ; move argument to rdx
0000000000001378 lea rax, qword [rbp+var_110] ; point rax to local target buffer
000000000000137f mov rsi, rdx ; "src" for j_strcpy
0000000000001382 mov rdi, rax ; "dst" for j_strcpy
0000000000001385 call j_strcpy ; call to strcpy
000000000000138a lea rax, qword [rbp+var_110]
0000000000001391 mov rdi, rax
0000000000001394 call sub_1218 ; call this other procedure (unknown so far)
0000000000001399 cmp byte [rbp+var_11C], 0x0 ; if the second arg is zero
00000000000013a0 je loc_13b3 ; jump over the following block ――――╮
; │
00000000000013a2 lea rax, qword [rbp+var_110] ; │
00000000000013a9 mov rdi, rax ; argument "__s" for method j_puts │
00000000000013ac call j_puts ; call puts with just a string │
00000000000013b1 jmp loc_13cc ; jump over the following block ――――┼――╮
; │ │
loc_13b3: ; <――――――――――――――――――――――――――――╯ │
00000000000013b3 mov rdx, qword [stdout] ; make rdx point to stdout │
00000000000013ba lea rax, qword [rbp+var_110] ; │
00000000000013c1 mov rsi, rdx ; "__stream" for method j_fputs │
00000000000013c4 mov rdi, rax ; "__s" for method j_fputs │
00000000000013c7 call j_fputs ; call puts with a string and stdout │
; │
loc_13cc: ; <―――――――――――――――――――――――――――――――╯
00000000000013cc nop
00000000000013cd mov rax, qword [rbp+var_8] ; read back the local variable
00000000000013d1 xor rax, qword [fs:0x28] ; xor with the return address
00000000000013da je loc_13e1 ; if they match, jump below
00000000000013dc call j___stack_chk_fail ; otherwise, report stack check failure
Time to look at sub_1218
. It consists of a loop going over its argument in rdi
, with 37 instructions in the body of the loop doing mostly mov
, add
, sub
, imul
, sar
and shr
. This seems to be written by hand, since I don’t think a compiler would generate such long-winded and repetitive code. There isn’t much point going over the details since they are pretty tedious, but we don’t have to explore it in more detail since it’s clear that this function has no other purpose than decrypting strings.
Next, we can figure out the actual base address used for the program by starting with a breakpoint on strcpy
:
(gdb) b strcpy
Breakpoint 1 at 0x1030
(gdb) r
We end up in strcpy
itself, so can just use ret
to get back to the caller:
(gdb) bt
#0 __strcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/strcpy-sse2-unaligned.S:47
#1 0x000055555555538a in ?? ()
#2 0x00005555555552c9 in ?? ()
#3 0x00007ffff7e2c09b in __libc_start_main (main=0x5555555552a1, argc=0x1, argv=0x7fffffffed68, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffed58) at ../csu/libc-start.c:308
#4 0x00005555555550ba in ?? ()
(gdb) ret
#0 0x000055555555538a in ?? ()
(gdb) x/4i $pc
=> 0x55555555538a: lea rax,[rbp-0x110]
0x555555555391: mov rdi,rax
0x555555555394: call 0x555555555218
0x555555555399: cmp BYTE PTR [rbp-0x11c],0x0
We recognize the instructions that follow the call to strcpy
: the lea
was originally found at 0x000000000000138a
and we now see it at 0x55555555538a
, so we can change the base address to 0x555555554000
for them to match in Hopper and GDB.
After the two strings are printed, we call sscanf
to read the password. This input is passed to sub_13e3
along with the constant string "fhz4yhx|~g=5"
. This function starts by taking its second argument (what looks like an encrypted string again) and calls sub_1175
on it before comparing the result to our password input:
00005555555553ef mov qword [rbp+var_20], rsi ; the encrypted string saved in var_20
00005555555553f3 mov rax, qword [rbp+var_20] ; then moved to rax
00005555555553f7 mov rdi, rax ; and then rdi
00005555555553fa call sub_1175 ; before calling sub_1175
...
0000555555555422 mov rax, qword [rbp+var_18] ; load var_18
0000555555555426 lea rdx, qword [rax+1] ; make rdx point to the byte that follows
000055555555542a mov qword [rbp+var_18], rdx ; and save back into var_18 (this is a counter)
000055555555542e movzx eax, byte [rax] ; read the current byte
0000555555555431 movsx eax, al ; trim to keep just 8 bits
0000555555555434 cmp dword [rbp+var_4], eax ; compare to var_4
0000555555555437 je loc_555555555404 ; loop back up if they match
This seems to be comparing our input with the decrypted string produced by sub_1175("fhz4yhx|~g=5")
.
Looking at sub_1175
, it is very similar to the earlier decryption function sub_1218
: it has the same constant 0x7b1
at the start, the same multiplication by 0x66666667
, the same bit shifts… it seems like we could try to assume that they do the same thing and see where this goes.
If we look at how the return value of sub_13e3
is handled, there are two cases:
0000555555555300 call sub_13e3 ; the decryption + comparison function
0000555555555305 test eax, eax ; if it returned 0, jump below
0000555555555307 jne loc_55555555531c ; ――――――――――――――――――――――――╮
0000555555555309 mov esi, 0x1 ; │
000055555555530e lea rdi, qword [aFtyynjy] ; "Ftyynjy*" │
0000555555555315 call sub_1348 ; sub_1348 │
000055555555531a jmp loc_55555555532d ; jump all the way down │
; │
000055555555531c mov esi, 0x1 ; <―――――――――――――――――――――――╯
0000555555555321 lea rdi, qword [aZwvup] ; "Zwvup("
0000555555555328 call sub_1348 ; sub_1348
Both of these strings are sent to sub_1348
for decrypting and printing, so we can try taking a look ourselves by just calling the function by hand. We start in GDB with a breakpoint in sub_12a1
just before a call to sub_1348
, and change rdi
to point to the string we want to investigate:
(gdb) b *0x00005555555552c4
Breakpoint 1 at 0x5555555552c4
(gdb) r
Starting program: ./login
Breakpoint 1, 0x00005555555552c4 in ?? ()
(gdb) set $rdi = 0x0000555555556040
(gdb) printf "%s", $rdi
Ftyynjy*
(gdb) c
Continuing.
Correct!
Insert your password:
Perfect! "Ftyynjy*"
decodes to "Correct!"
, so indeed the decryption code is the same. We repeat the same process for "Zwvup("
:
(gdb) set $rdi = 0x0000555555556049
(gdb) printf "%s", $rdi
Zwvup(
(gdb) c
Continuing.
Wrong!
So now let’s see what sub_1348
would do with the second argument to sub_1175
:
(gdb) set $rdi = 0x0000555555556033
(gdb) printf "%s", $rdi
fhz4yhx|~g=5
(gdb) c
Continuing.
ccs-passwd44
Let’s try it:
$ ./login
Don't patch it!
Insert your password: ccs-passwd44
Correct!
This wasn’t the most challenging crackme we’ve seen so far. Its complexity was limited to the decryption function, but the presence in the strings
output of the encrypted password and a general lack of obfuscation or reverse-engineering countermeasures made it relatively simple to break.
With this password now revealed, a Google search reveals the actual source code for the crackme. You’ll recognize the puts
vs fputs
logic, the calls to strcpy
and scanf
, and a pre-processor macro wrapping each string so that only their encrypted values are included in the binary.
Edit:
A day after publishing this walk-through, I tried opening the login
binary with
Ghidra and was amazed by how much it managed to simplify the decryption function. Its disassembler produced the following code (I only renamed the variables):
char * FUN_00101218(char *input) {
unsigned int temp;
char *cur_char;
temp = 0x7b1;
cur_char = input;
while (*cur_char != '\0') {
temp = temp * 7 & 0xffff;
*cur_char = *cur_char + ((char)(temp / 10) * '\n' - (char)temp);
cur_char = cur_char + 1;
}
return input;
}
This is so much more concise than the mess of mov
and bit shifts that we see in the raw disassembly, with most of the complexity now erased by its successive rewriting rules.
Note that \n
is also 10, so the expression (temp ÷ 10) × 10 - temp
can be reduced further to - (temp % 10)
. We can therefore rewrite it as:
def decode(input):
temp = 0x7b1
output = []
for i in range(len(input)):
temp = (temp * 7) & 0xffff
output.append(chr(ord(input[i]) - (temp % 10)))
return ''.join(output)
And test it:
>>> for input in ["Gtu.}'uj{fq!p{$", "fhz4yhx|~g=5", "Ftyynjy*", "Zwvup("]:
... print(decode(input))
...
Don't patch it!
ccs-passwd44
Correct!
Wrong!
With this kind of result, it seems like I need to start using Ghidra more regularly.