A full-scale guide to hacking a simple scenario runner that requires no prior knowledge of what Reverse Code Engineering is.
I became interested in visual novels (and anime) much earlier than I became interested in
hacking them for translation purposes. This happened near September of 2008 (I still remember
the month because that’s when the school year starts lol).
When this happened I suddenly found that I was able to understand those messy lines of assembler code that came out of OllyDbg and had seemed totally unmanageable before, almost scaring me into a faint.
I’ve set up a page with my visual novel (VN for short) tools which is
still located here, although I’m planning to significantly
improve it some day.
I wrote this tutorial for one of my Internet friends with whom I had an intensive chat for several months (which resulted in more than 200 forum posts, some of which were 60-90 KiB in size – pure ANSI). This guide is intended to give an all-round view of how reverse code engineering (RCE) is performed. It requires no prior knowledge, except maybe a general idea of how a computer works and what WinAPI is. You don’t even have to be able to write assembly code – you’ll learn this and more as you go through the pages.
It’s ironic that none of the people (two at the time of this writing) to whom I’ve sent this
guide actually completed it, even though they had themselves asked if I could teach them some
hacking stuff. In fact, I don’t know if they started at all, haha… Well, nobody is
to blame for this, of course.
But, still, I would highly appreciate any feedback that you might drop in the comments!
Survey (started May 2013): if I were to write an e-book specifically targeting VN translations would you buy it?
I'm a bot • No, I’d just DL it • Yes, for $5 • Yes, for $10 • Yes, for $15 • Yes, for any reasonable price (up to $40).
Please be honest!
Oh, and yes, if you spot a typo or a mistake while reading the tutorial please select it in your browser’s window and press Ctrl+Enter – it’ll send a message to me so I can fix it. Thanks!
Now without further ado let’s dive into the world of hacking…
Part 1 – the Bruteforce – let’s start! »
P.S. thanks to someone from Moscow who’s sent me about 50 typo reports – I’ve fixed them all, thanks a lot, pal :)
28 March 2013
SK
Hey, Proger_XP, I need your help so I’ve written you an e-mail. Please check it out. Thanks.
28 March 2013
Proger_XP
I have already replied to you.
30 March 2013
SK
Thanks, I’ll try your advice and reply to you if I succeed (or if I need to ask something :))
7 January 2013
Alexscri
Hey, just wrote you an email… hope you can get around to it, I’d really appreciate some help! :D
30 May 2012
Red
Right now I'm on page 9 of the tutorial and following along with the debugging. You say that «Let’s see, our first catch has hFile = 0x4C» after the first breakpoint, but you don't explain where to find this information. I've been looking and looking but don't know how you know what value each of the function's parameters has. Could you please explain the step before this?
30 May 2012
Proger_XP
hFile changes on each call to CreateFile so you won't find it. Looks like this isn't a very clear part of my guide since there was another comment on this question earlier – check out my answer.
30 May 2012
Proger_XP
Of course it does! Check the source:
In other words, EAX is overwritten 2 times and no longer contains nNumberOfBytesToRead (it contains hFile). To get that argument simply set the breakpoint on the line with the corresponding push eax.
30 May 2012
Red
Alright, I was looking in the right place, then. See, when I hover over the eax next to the
; hFile comment above ReadFile, it displays the value as just 0003. Here's a screenshot.
15 May 2012
Anonymous
I've finished reading this tutorial, but I still don't understand… :( Specifically, if I want to translate a VN, what must I load into IDA or 010 Editor (or some other software) and modify?
15 May 2012
Proger_XP
You've read the entire thing and still have no idea?
The point of what is described here is to load the game executable that interprets game scenarios and determine the exact algorithm it uses to extract scripts from game archives (if there are any), how it converts scenario bytecode into operations and how it treats strings. Then you write your own tool to convert the scenario into text files that, after editing (translating), can be converted back into the format recognized by the game.
18 May 2012
Proger_XP
Sure, feel free to ask again if you still don't get something. And make sure to understand what I'm doing and why throughout the tutorial.
18 May 2012
Anonymous
I'll read it again; maybe after this I can figure something out… Anyway, thank you!
18 March 2012
Anonymous
Proger! I'm a beginner at translating visual novels. Right now I'm on part 3 (took me hours :P), and I got stuck. Not just stuck, I'm not sure if I'm doing it properly. I opened IDA and loaded the scenario runner, but then I got lost. I'm confused by the steps; it gets very complicated since the concept of this program is really confusing. Can you help or post a video tutorial?
22 March 2012
Proger_XP
Hey,
Can you give me some details on what exactly confuses you? Or at least starting from which paragraph (in Part 3).
14 February 2012
Proger_XP
@Sledge, @Pyrex Hey guys, I've sent you an e-mail – just in case they got lost along the way…
9 March 2012
Proger_XP
Hey Sledge, are my mails still landing in your spambox?
15 February 2012
Sledge
Hey Proger, I haven't received anything. It's dhuanco [at] gmail [dot] com
15 February 2012
Proger_XP
Strange, my client says it's been sent. Check your spam or something – I've just repeated my message.
27 January 2012
Proger_XP
By the way, Sledge, if you've got some free time on your hands, wouldn't you like to write a tutorial yourself? You were a beginner just recently and probably remember lots of things you were confused about but which someone like me has forgotten when writing this and other guides.
I can assist you if you go for this.
27 January 2012
Sledge
Hey, that's not a bad idea, Proger. But I'm afraid I haven't YET passed the beginner level. It's true I decoded the script format of a real VN game, but frankly the developers weren't really trying hard to keep it protected from hackers. I can tell because I'm still trying to hack other VNs and all of them seem to use much more complex algorithms. But when I get the hang of this whole thing I'll definitely write my own «Guide to Exploring VN Script Formats» and use the VNs I hacked as examples. I must say I was lucky to find your tutorial; it gave me a head start in understanding not only the Win32 API but also some basic concepts of debugging, which I used to think was far more complex than it actually is.
28 January 2012
Sledge
Sure. I suggested TeamViewer so you could check it without having to install it on your computer. The game's name is «Gibo Stepmother Sin» from Guilty http://forum.nihonomaru.com/eroge/169773-peach-princesss-gibo-stepmothers-sin.html
28 January 2012
Proger_XP
I've used TeamViewer and it's a good piece of software. But first you should tell me the game's name so I can investigate a bit for myself.
28 January 2012
Sledge
Well, as a matter of fact, I do need some help hehe! I'm stuck on this game that has an endless loop that calls WinAPI functions such as CreateFile and ReadFile. What bothers me is that it just never stops: it keeps calling these functions, always pointing to the same file!
I've got an idea of how you could help me: we could use TeamViewer so I can share my computer screen (http://www.teamviewer.com/ – it's free software, and you don't have to sign up to use it), then I would open IDA and start the VN game in it, and you'd see what's going on. What do you say?
27 January 2012
Proger_XP
Big things start from small solutions. My first VN (Crescendo using IkuraGDL) might have been a bit harder than yours (it uses archives and some basic encryption) but still it wasn't protected very much either.
But I shall not hurry you, definitely not. Just know that when you finally feel like giving it a try there's someone who'll gladly host your stuff and help with texts and such.
Lol, it sounds so bombastic – sorry, I didn't mean anything like that -_-'
Haha, then those two days of my life that I've spent writing this guide were not in vain :D That's wonderful. Welcome aboard and don't forget to ask things if you ever need some help :)
25 January 2012
PyrexMaster
Hey Proger! I'm an extreme noob when it comes to… well… computer programming in general… Translation is more my field. But I want to learn how to take apart a VN so that I can create a patch for the game. However, since I know nothing (not even the basic mechanics of how computers work OR what WinAPI is) your tutorial is a little hard to understand.
I have downloaded the 010 hex editor and your scenario runner, opened the 'runme.dat' but then… I can't even find the extra «!» that you were talking about… I did, however, understand that deleting it decreases the string length… though I'm not sure what 'exactly' that is (sorry I'm such a noob T_T) or how I was supposed to fix it…
Please help me, I desperately want to translate this game I am working on and I have found no other places that might be able to help… you are my only hope… – PyrexMaster
25 January 2012
Proger_XP
Hey Pyrex!
Nice to see you around. Don't worry, I'm extremely (that's true) excited to explain something to someone from «scratch» :)
Please ask questions without a second thought even if they look too noobish – it helps me understand what you and others need to learn.
(Oh, and it's pretty good to be a Japanese translator… my respect.)
Let's start with the hex editor and how a program usually handles strings. What you see in front of you when you open runme.dat is exactly what my ScenarioRunner sees too. It means that it starts from the very first byte – that is, 06. Then it sees 00, then 0E 00 and so on.
In other words, it reads this file byte by byte. Actually, runme.dat is just like a story script or performance scenario. Why? Because if you remove some stuff from this file, change it or do something else, the program will, depending on the changes (and thus on the file contents), also alter its own execution. Just like a stage performance – give the guys the wrong sheet and they'll screw it up… Or, as in our case, replace some stuff here and there and the «guys» will do it alright – but sing a different song. Talk in another language. Sounds familiar, eh? That's what we want ScenarioRunner or any other game to do – speak in English instead of Japanese. That's just one case of course.
So yeah, the program reads a scenario file from its first byte (actually it can read it from an arbitrary position but it's safe to assume this simplest case), interprets those bytes and does something, then repeats. By interpreting here I mean that, say, a byte with hex value FF doesn't mean anything by itself – this value won't make MS Paint output a message, but when our program reads it, it thinks: «Aha, my scenario instructs me to do command No. 255 – let's look up what it means…». Naturally, every program that runs scenarios has an opcode (operation code) table that maps codes to the specific routines that actually output messages, read user input, etc.
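To make the «opcode table» idea above a bit more tangible, here is a minimal sketch in C. It is not how ScenarioRunner is actually written: the handler names, the 01 opcode and the counters are made up for illustration, and the toy commands take no arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler type: gets the scenario bytes and the position right
   after the opcode, returns the position of the next opcode. */
typedef size_t (*op_handler)(const uint8_t *script, size_t pos);

/* Illustrative counters so the effect of each handler can be observed. */
static int messages_shown = 0;
static int waits_done = 0;

static size_t op_message(const uint8_t *script, size_t pos) {
    (void)script;
    messages_shown++;
    return pos;                 /* this toy command has no arguments */
}

static size_t op_wait(const uint8_t *script, size_t pos) {
    (void)script;
    waits_done++;
    return pos;
}

/* The opcode table: index = opcode byte, value = routine (NULL = unknown). */
static op_handler opcode_table[256];

static void init_opcode_table(void) {
    opcode_table[0xFF] = op_message;   /* FF = output a message, as in the text */
    opcode_table[0x01] = op_wait;      /* assumed: 01 = wait for user input */
}

/* Reads the script byte by byte and runs the routine mapped to each opcode.
   Returns the number of commands executed, or -1 on an unknown opcode. */
static int run_script(const uint8_t *script, size_t len) {
    assert(script != NULL);
    size_t pos = 0;
    int executed = 0;
    while (pos < len) {
        op_handler handler = opcode_table[script[pos]];
        if (handler == NULL)
            return -1;
        pos = handler(script, pos + 1);
        executed++;
    }
    return executed;
}
```

Call init_opcode_table() once, then run_script() on the bytes of a scenario. A real engine would have handlers that also consume their arguments, which is where strings come in.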
Now let's dive a bit deeper. Say a program sees byte FF which means «output a message» – a message like you see in any visual novel. But wait – the program does not contain those messages, does it? It runs scenario file(s). So the message must be there.
Naturally, as any scenario file is an octet stream («octet» is a clever name for «8 bits» which in our context means «byte»), or a data stream, or a stream of bytes. So when it reads that FF byte it must then read the string to output. Technically, nothing prevents the program from reading it from the end of the scenario file, from another file or from some network location (but I haven't seen such crazy programs yet). Usually, since programmers are in general sane people, they will read the message to be output right after the instruction's opcode. In other words, first goes the opcode, then goes some string.
Does it make sense so far?
So we're speaking about this structure: opcode, then string.
For example: FF, then «Hello!».
Or, in bytecode: FF 06 48 65 6C 6C 6F 21
First we see the message opcode – FF – that makes the program output some message; then we see some stuff. Since the program reads the file byte by byte from left to right (even Arabic programs don't read files from right to left) it sees FF first, then it reads the string… but up to which length? So 06 above is actually the string's length – the program reads this value and thinks: «Look, the string is going to be 6 characters long. Let's grab them all». So after reading 06 it reads all those 48 65 6C 6C 6F 21 bytes that correspond to the ANSI codes of the characters «Hello!».
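The reading order described above (opcode, then length, then that many characters) can be sketched in C roughly like this. It follows the simplified FF / 06 / «Hello!» illustration with a one-byte length, not necessarily the real runme.dat layout; read_message and OP_MESSAGE are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OP_MESSAGE 0xFF   /* opcode value taken from the FF example */

/* Reads one «message» command starting at script[pos].
   Copies the text into out (NUL-terminated) and returns the position of the
   next command, or 0 if the bytes do not form a complete message. */
static size_t read_message(const uint8_t *script, size_t len, size_t pos,
                           char *out, size_t out_size) {
    assert(script != NULL && out != NULL);
    if (pos + 2 > len || script[pos] != OP_MESSAGE)
        return 0;                        /* no room or not a message opcode */
    size_t text_len = script[pos + 1];   /* the length byte, e.g. 06 */
    if (pos + 2 + text_len > len || text_len + 1 > out_size)
        return 0;                        /* truncated data or buffer too small */
    memcpy(out, script + pos + 2, text_len);
    out[text_len] = '\0';
    return pos + 2 + text_len;
}
```

Feeding it the bytes FF 06 48 65 6C 6C 6F 21 fills the output buffer with «Hello!» and returns 8, the offset where the next command would start.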
Now about this part. What I meant here was this piece of runme.dat:
Suppose we want to make the program output «Do not cry…» instead of the above text. What is the most obvious way? Simply overwrite the above bytes (from «M» 4D to «!» 21) with the new string. In 010 Editor, put the cursor on the letter «M» in the right frame (the narrow column with letters and symbols), make sure OVR, not INS, is shown in the bottom-right corner of the program's status bar and start typing. OVR means overwrite mode while INS means insert mode – you can toggle between them with the Insert key on the keyboard (near Delete and Home).
You will type the last period (of «cry…») and the cursor will stop on the exclamation mark that is left over from the original string. What to do with it? Well, this is the point where the tutorial can pick you up:
Good start! Now push harder and break the language barrier :)
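The leftover «!» shows that the stored length and the actual text have to stay in sync. As a rough sketch of that idea (assuming the simplified one-byte length from the FF example; write_message is a hypothetical helper, not part of any real tool), a translation tool can rebuild the whole command so the length is always recomputed from the new text:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes opcode + length byte + text into out.
   Returns the number of bytes written, or 0 if the text does not fit into
   a one-byte length or into the output buffer. */
static size_t write_message(uint8_t opcode, const char *text,
                            uint8_t *out, size_t out_size) {
    assert(text != NULL && out != NULL);
    size_t text_len = strlen(text);
    if (text_len > 255 || text_len + 2 > out_size)
        return 0;
    out[0] = opcode;
    out[1] = (uint8_t)text_len;     /* length always matches the new text */
    memcpy(out + 2, text, text_len);
    return text_len + 2;
}
```

With this approach a shorter or longer replacement string never leaves stray characters behind, at the cost of rewriting the file instead of patching it in place.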
27 January 2012
Proger_XP
@PyrexMaster
Hm, strange, there doesn't seem to be much new info since all these words are synonyms in our context: octet = byte = 8 bits. But I see what you mean, will keep this in mind. Thanks.
This is sorta hard because my TZ is GMT+4 and we have 9 hours of difference. Maybe e-mail is better if you can't be online from 2:00 EST to 10:00 EST. My e-mail is proger.xp@gmail.com.
About NScripter – if it's the engine your game uses then you're lucky because it's old and its format has been unraveled to the last bit over the years. You need two tools:
@Sledge
I'm absolutely delighted to see you, a grown hacker. There's a long path ahead if you choose to take it :)
27 January 2012
PyrexMaster
that's cool… the engine for the game I'm running is Nscripter… game I'm translating
> http://vndb.org/v581 but yea… I'll check out that game…
27 January 2012
PyrexMaster
Sorry I missed you, I was busy doing some stuff for my parents, but I will be on all day tomorrow. If you could be there around 6:00 pm EST that would be good…
Also, an example of what I meant was this:
«Naturally, as any scenario file is an octet stream («octet» is a clever name for «8 bits» which in our context means «byte»), or a data stream, or a stream of bytes.»
It's like you're throwing a whole extra sentence into the middle of the current sentence and it makes it very hard to understand… =(
27 January 2012
Sledge
PyrexMaster, after you practice with Proger's ScenarioRunner and feel like you're ready to move on, you could start trying to hack a real VN game. I've discovered a VN game that's actually very easy to hack, Fatal Relations: the algorithm that decodes the scenario script is easy to understand (I'm a beginner too), and you can crack it with just a few lines of code.
26 January 2012
Proger_XP
Which braces exactly do you mean?
I'll check your link when I've got some time, perhaps tomorrow. Thanks for your visit.
26 January 2012
PyrexMaster
Ok I understand most of what you said. (Though if you could add more examples and explain the things that you put in «()» at the end of a sentence, that would be appreciated.)
Could you please come to http://piratepad.net/6N4WvdVdSH This is where I've been working on my translation and keeping everything that I have translated. It has an IM chat on it and I would like to talk to you more in depth about this kind of thing. =D I would be very appreciative if you could spare the time to talk to me. Thank you.
9 January 2012
Sledge
Hey Proger! Here I am, in need of your assistance again!
This is the routine that decodes the script:
This is the code I came up with in C language. Do you think it's been coded correctly?
```cpp
int decode(char * &data) {
    // I'm not sure signed char is the best data type for the low
    // piece of a Extended Register
    signed char al;
    signed char bl = 0xA0;
    signed char dl = 0x4B;
    // for now, just want to decode 500 characters
    for(int i=0; i < 500; i++) { // Start Loop
        al = dl + bl;
        bl = *(data) - al;
        *(data) = bl;
        data++;
        bl = dl;
    }
    return 1;
}
```
9 January 2012
Proger_XP
You're getting the hang of this whole thing, Sledge :) Your code seems fine to me although generally you should try to give variables more «human» names (even if you don't fully understand their purpose). For example, I would name al as byteKey, thisKey or even salt and v1/v2 as key1/key2.
Since you're posting a lot lately let me give you a brief markup legend:
bold, italic, under, super, strike, The Oogle
Code is created by placing %% around the text and can be block (each %% is placed on a separate line) or inline. You seem to already know that :) But you can highlight the code too by putting braces with the syntax scheme after the opening %%:
%%(php)echo 'code';%% %%(delphi)WriteLn('Hi!');%% %%(asm)IMUL EAX, 03h%%
These render as PHP echo 'code';, Pascal WriteLn('Hi!'); and asm IMUL EAX, 03h.
Block code (works with any syntax scheme, e.g. those above):
Currently there's no C highlighter, though.
9 January 2012
Proger_XP
You've made it right in the code but not in the comment :)
I'm not sure why you're using unsigned char, though – there's no difference as long as you're using bitwise ops but it still looks redundant.
This is usually how games work – most of them use symmetric or almost symmetric de/coding which is easier both to program and to reverse-engineer.
You can name it «transform», for instance.
9 January 2012
Sledge
…should have used the code tags, sorry, there you go: Erase this post later if you can please.
Haha, never mind that :) – Proger_XP
9 January 2012
Sledge
Hey Proger! I thought I was going to have big trouble figuring out how to (re)encode the modified ASCII text, but all it took was changing ONE line in my code. See:
```cpp
int decode(char * data, int len, bool decode=true) {
    int8_t al;
    int8_t v1 = 0xA0;
    int8_t v2 = 0x4B;
    for(int i=0; i < len; i++) {
        al = v1 + v2;
        if(decode)
            *(data) = *(data) - al;
        else
            *(data) = *(data) + al;
        data++;
        v1 = v2;
        v2 = al;
    }
    return 1;
}
```
Now the decode function works for both encoding and decoding, I just have to set the 'decode' parameter. By the way, I should change this function's name too since it now does both things.
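For comparison, here is a hedged restatement of the same running-key idea in plain C, using unsigned bytes so the additions wrap modulo 256 in a well-defined way. The seeds 0xA0 and 0x4B come from the code above; the name transform and the key1/key2/this_key variables are only suggestions, and nothing here has been checked against the game's actual routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Applies the running-key cipher in place.
   encode = false subtracts the key (decoding), encode = true adds it back. */
static void transform(uint8_t *data, size_t len, bool encode) {
    assert(data != NULL || len == 0);
    uint8_t key1 = 0xA0;   /* seeds taken from the code above */
    uint8_t key2 = 0x4B;
    for (size_t i = 0; i < len; i++) {
        uint8_t this_key = (uint8_t)(key1 + key2);   /* wraps modulo 256 */
        data[i] = encode ? (uint8_t)(data[i] + this_key)
                         : (uint8_t)(data[i] - this_key);
        key1 = key2;       /* shift the key pair, Fibonacci style */
        key2 = this_key;
    }
}
```

Because the key stream depends only on the position and never on the data, encoding a buffer and then decoding it gives back the original bytes, which is a quick sanity check before trying the tool on a real script.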
9 January 2012
Sledge
I finally discovered where the mistakes were in that C code. I was using the reference '&' symbol on the parameter in the function signature. I forgot this is not necessary since pointers are already a reference.
Here's the code:
```cpp
int decode(char * data) {
    int8_t al;
    int8_t bl = 0xA0;
    int8_t dl = 0x4B;
    for(int i=0; i < 5000; i++) {
        al = dl + bl;
        bl = *(data) - al;
        *(data) = bl;
        data++;
        bl = dl;
        dl = al;
    }
    return 1;
}
```
Again, the original routine:
Now the problem is, this routine doesn't decode the WHOLE script; it seems that the script file is actually an archive, split into many parts which are decoded only when needed.
8 January 2012
Sledge
Hey what's up! After hours of racking my brain trying to debug a VN game, I was going crazy over why I couldn't find the ReadFile function in IDA's import list, then I found out that it has a different name: _hread.
The name is different but I know it's the same Win32 function ReadFile because of the parameters: they're exactly like ReadFile's. Not only that, but other functions from the Win32 API also have different names: CreateFile is _lcreat, FileSeek is _llseek. It took me some time to figure that out; I didn't think these functions could have different names. And with that I could finally find the decoded script in pure ASCII format at some memory address. Beautiful!
8 January 2012
Proger_XP
Hey, Sledge!
You know what, you're actually making progress. Now that you've found the decoded script you're halfway there. And I can see plain texts there so the game doesn't seem to encode them individually in any way (for example, some games use a modification of ShiftJIS that encodes kana as 1 byte instead of 2 to squeeze texts).
The reason why you're seeing those «_» functions is that they are 16-bit versions of CreateFile and friends (see the _lcreat function). This means they're really really old, even older than Win9x in some way. I even doubt that the game will run on Vista and above since 16-bit support has been removed from the core.
Have you found an extremely old game or is it using those functions as a surprise for the hackers?
8 January 2012
Proger_XP
You're welcome :) Definitely, you should finish off your reverse-engineering work with some handy tool. But before you start to write one make sure you understand the algorithm – for example, by doing the calculations on paper or in your head.
For complex algorithms you would write the tool while uncovering the algorithm, but this isn't the case with most simple XOR ciphers, so understanding the thing beforehand can save you some trouble later.
P.S. And, btw, if you want I can put your tool on my page when it's finished :)
8 January 2012
Sledge
What's even funnier is that I remember having problems running this game on WinXP, but not on Win7, so maybe it supports this old junk code better than XP, which is an old OS by now. Anyway, I'll try to code something in C++ that does exactly what that looping routine does with the script. Your documents have been a great help on this road! Thanks again, and I'll surely be coming back here!
8 January 2012
Proger_XP
Hm, so 16-bit support is still there in Win7? Well, perhaps a small part of it such as the old procedure names. They haven't bothered about WinXP support (a virtual machine doesn't count), so I doubt they would about Win 3.x.
Anyway, good to know indeed. Keep up the pace! You've found good study material, it seems to me :)
8 January 2012
Sledge
Yes. It's indeed an old game. Fatal Relations from C's ware, probably from 1995. But it's running fine on Windows 7 (and I didn't even set compatibility mode). So that's why it has different names… good to know!
8 January 2012
Sledge
7 January 2012
Sledge
Please correct me if I'm mistaken: for every decryption algorithm, there's gotta be a loop that executes x times, x being the size of the encoded file (in bytes), right?
7 January 2012
Proger_XP
Usually yes, although a game might use the standard Windows CryptoAPI or some other external function, but I haven't seen one doing this yet.
However, this is usually more complicated. To give you an idea:
For example, Sono Hanabira uses XOR encryption like this:
Another game, PatchuCon, uses a multitude of XOR tables and runs the same data through different kinds of XOR loops several times. You can get an idea of the amount of XORing by checking my tool's code.
Planetarian is definitely a killer because it uses some mad calculations to decode its data. I haven't worked out the full mechanics but even Haeleth (who created RLDev) is using a hook in his localization.
And completely crazy engines like WARC (.war files) even contain EXE files in some of their archives, which they call (after decrypting them) to decrypt the rest of the archive. So they contain no decryption code themselves.
But don't worry – there are more simple encryptions than complex ones :)
Usually, if the game uses XOR (and most of them do since it's simple) you'll first stumble upon a loop which, as you trace it step by step, produces decoded bytes that you're able to see elsewhere (for example, when the game is already running and you're dumping its memory to look for loaded data). Then you find out how the loop works and trace back from it to the outer loop(s) that produce the XOR table(s) or otherwise control the decryption.
Btw, you can do memory dumps with LordPE.
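To give a feel for the simplest kind of XOR loop mentioned above, here is a minimal sketch in C. It is a generic illustration only: the table contents are invented, and none of the named games is claimed to work exactly this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XORs every data byte with a byte from a repeating key table, in place.
   The same call both encrypts and decrypts, since (b ^ k) ^ k == b. */
static void xor_with_table(uint8_t *data, size_t len,
                           const uint8_t *table, size_t table_len) {
    assert(table != NULL && table_len > 0);
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        data[i] ^= table[i % table_len];
}
```

In a real game the table is usually produced by an earlier loop rather than stored as a constant, which is why finding the decoding loop is only the first half of the job.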
3 January 2012
Sledge
Is it normal for a VN game to call the CreateFile function countless times? I set a breakpoint on the imported function CreateFile, and I keep pressing F9 each time IDA stops execution at it. What's even worse is that the first 4-byte address on the stack points to the SAME filename. And that's it. It just never ends, unless I disable this breakpoint. What could this be?
3 January 2012
Proger_XP
You're saying the CreateFile args don't change? Have you tried looking around the code, perhaps checking the caller, what it does and why it loops?
It's normal for a game to call CreateFile multiple times (say, 5) even with the same file name here and there but it definitely should not get stuck.
28 December 2011
Sledge
Hey! Can you name an English VN game whose script format you've cracked, so I can practice?
29 December 2011
Proger_XP
You can pick any engine from my VN tools page (vn.i-forge.net). You can pretty much determine which engine a game uses by looking at its archives' file extension or using some extraction tool (like AnimED, ExtractData or CRASS) – usually it'll tell you the name of the format.
Sometimes you can also see the engine name in the game exe's File Info dialogue (under Version, Comment or another field).
Some commonly used engines I've primarily dealt with:
That's what I can come up with ATM. If you have any questions don't hesitate to ask, now or later :)
4 July 2011
Anonymous
Hey, could you tell me what tutorials on reverse engineering you read so I can learn from them too? I've been trying to crack a script file from a VN game without success for weeks. Maybe if you point me to the documents you've read I'll have better results! Thanks in advance!
5 July 2011
Proger_XP
Yeah, there's a lot of debugging practice with IDA, and ScenarioRunner is a program I wrote myself that mimics the structure and behaviour of a typical visual novel or other scripted game.
Please don't hesitate to ask questions anytime should they arise.
5 July 2011
Anonymous
I see. I started reading your tutorial just yesterday. I only knew the basics of ASM in theory; it seems like your documents provide some practical examples through the mentioned software (IDA debugger) and the Scenario Runner program, which will be great for my study. Thanks.
4 July 2011
Proger_XP
Frankly I haven't read any hacking tutorials myself :) As I say in the tutorial, one day I just woke up and understood how things worked, then I tried to hack a few games to polish my skills.
Did my tutorial help you in any way? If there's something unclear I might be able to explain. Also note that there are games with strong protection so you might have been out of luck picking one of them.