Preparatory stage – our first attempt, where we try brute-forcing a scenario.
VN hacking is a very interesting and broad topic, largely because of its puzzles, which sometimes make our minds work very hard :) I’ll try to show you the basics of RCE (reverse code engineering) and hacking overall in this chapter.
The test script interpreter is here. Download it now but try not to look at the source code yet :)
What we need now is a hex editor. If you don’t have one I suggest you get 010 Editor (a trial will do for now). Some time ago I used WinHex but it sucks when working in a Japanese locale – it becomes almost unusable. I tried 010 Editor, a feature-packed tool, and it turned out to be quite good.
See runme.dat in the Scenario Runner’s directory? That’s the «scenario» that it runs. Run the EXE. The scenario is simple, as is the interpreter itself – it outputs a string, then asks you a question and, depending on your answer, either asks you the same question again or outputs a message and exits.
Your job is to change the strings it outputs – that’s what we do when we need to translate a game, for example.
We’ll start with brute force. Open runme.dat in a hexed. You can clearly see some strings like «Make me laugh!» and others. What if we change one of them?
Edit the first string – I made it «Do not cry...» (with three ordinary periods). The new string is one character shorter than the old one, so an extra «!» is left over from the old string – delete it. The file will become 1 byte shorter.
When you edit in a hexed there are two areas where you can do this – on the left you enter hex values (0-9, a-f) while on the right you can enter letters and other symbols normally. Since we need to enter a «human» string it’s more convenient to do that on the right side.
Btw, you can switch between the two sides with Tab.
Also, note that there are two modes – overwrite and insert. You can switch between them as you do in normal text editors – by pressing the Insert key.
Run the exe. Oh huh, we get some exception and also we see that it output some «☺» – while it’s good to stay positive it’s not exactly what we expected to see :)
We probably forgot to update something, for example, a string length. Naturally, the app should know how long the string is. How would it learn that?
Remember the two general types of strings – C-style and Pascal-style. As you know, C-style strings don’t have any length field; the length is determined at run time by finding a char with code 0: 32 33 34 00. Pascal-style strings don’t use null chars; instead they store the string length before the actual string, which makes string processing much faster (at the cost of an additional length field, but that’s a very low cost):
03 00 32 33 34. Usually that field is 2 (word) or 4 (double word) bytes long.
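If it helps to see the difference in code, here is a minimal Python sketch of reading both kinds of strings from the two example byte sequences above. The function names are made up for illustration, and the Pascal-style reader assumes a 2-byte length field with the low byte first, as in the 03 00 example – a real format may well use a different field size.

```python
import struct

def read_c_string(data: bytes, offset: int = 0) -> bytes:
    """Read bytes up to (not including) the first byte with code 0."""
    end = data.index(0, offset)
    return data[offset:end]

def read_pascal_string(data: bytes, offset: int = 0) -> bytes:
    """Read a 2-byte length (low byte first), then that many bytes."""
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    return data[start:start + length]

# The two example sequences from the text above.
c_style = bytes.fromhex("32 33 34 00")
pascal_style = bytes.fromhex("03 00 32 33 34")

print(read_c_string(c_style))            # b'234'
print(read_pascal_string(pascal_style))  # b'234'
```

Note that the C-style reader has to scan for the terminator, while the Pascal-style reader knows right away how many bytes to take.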
We need to look around to find where the problem is. Can you do that on your own? :)
Here’s what we have got: 0E 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E 2E.
Those 0E 00 bytes look exactly like the string’s length, right? We need to do a quick conversion to check this guess: hit F11 and enter 0E in the hex field – it’s 14 in decimal notation. You could also do this by placing the cursor before 0E and looking at the Inspector under Unsigned Short.
If you’re using a different hexed or something else prevents you from using a desktop base converter, you can use an online one such as this one at i-Tools.org.
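If you have Python at hand, the same check can be sketched in a few lines. This is only an illustration: it works on the 15 bytes quoted above (typed in by hand, not read from runme.dat) and assumes the first two bytes are an unsigned 2-byte length stored low byte first.

```python
import struct

# The bytes quoted above: a suspected 2-byte length, then the text.
raw = bytes.fromhex("0E 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E 2E")

# Plain hex-to-decimal conversion.
print(int("0E", 16))  # 14

# Same thing the Inspector shows as Unsigned Short at that position.
(length,) = struct.unpack_from("<H", raw, 0)
text = raw[2:]

print(length)                # 14
print(len(text))             # 13
print(text.decode("ascii"))  # Do not cry...
```

The stored length still says 14 while only 13 characters follow it, which matches the idea that we forgot to update something.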
A quick intro to how numbers are stored in memory and other locations.
One thing I found confusing about them in the beginning was that their bytes are stored in reverse order, least significant byte first: e.g. a 2-byte number with the value 255 looks like FF 00 in machine representation, not like 00 FF as we’d write it.
This is called little-endian or Intel byte order. There’s also big-endian byte order, which looks like our normal notation, that is, 00 FF.
However, Intel byte order is the most widely used (one place I know of that uses big-endian is network transfers).
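To make the two byte orders concrete, here is a small Python sketch using the standard struct module, where «<H» means a little-endian unsigned 2-byte number and «>H» means a big-endian one. The variable names are arbitrary; bytes.hex() with a separator needs Python 3.8 or newer.

```python
import struct

value = 255

little = struct.pack("<H", value)  # Intel byte order
big = struct.pack(">H", value)     # big-endian byte order

print(little.hex(" "))  # ff 00
print(big.hex(" "))     # 00 ff

# Reading the same two bytes with the wrong byte order gives a very
# different number, which is a common source of confusion.
print(int.from_bytes(b"\xff\x00", "little"))  # 255
print(int.from_bytes(b"\xff\x00", "big"))     # 65280
```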
By the way, before doing anything to your subject make sure to back up the original files in case you mess them up too much – this is always a good habit to acquire.
Update that length – our new string is 13 characters long, so 0E 00 becomes 0D 00 – and run the script. Wow, that’s cool, we now have a working script! Yahoo!
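The manual edit we just did (replace the text, then fix the length in front of it) can also be sketched in Python. Keep in mind this is an illustration only: the buffer below is made up (the old string followed by one unrelated byte), not the real layout of runme.dat, and replace_string is a hypothetical helper that assumes a 2-byte little-endian length field.

```python
import struct

def replace_string(data: bytes, offset: int, new_text: bytes) -> bytes:
    """Replace the length-prefixed string at `offset` and fix its length field."""
    (old_length,) = struct.unpack_from("<H", data, offset)
    old_end = offset + 2 + old_length
    new_field = struct.pack("<H", len(new_text)) + new_text
    return data[:offset] + new_field + data[old_end:]

# Made-up buffer: the original string followed by one unrelated byte (0x01).
original = struct.pack("<H", 14) + b"Make me laugh!" + b"\x01"
patched = replace_string(original, 0, b"Do not cry...")

print(original[:2].hex(" "))  # 0e 00
print(patched[:2].hex(" "))   # 0d 00
print(len(original) - len(patched))  # 1
```

This only touches the string and its own length field. Anything else in a file that depends on sizes or positions is left as it was.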
…well, not exactly, as it turns out after a bit of investigation – the script works normally when the second choice is made at the branch, otherwise it crashes. It says that the bytecode is corrupted…
Now I’ll leave you to explore the script with the hexed on your own so you can try to find where the trouble is. When you’ve used up all your deductions carry on to the most fascinating part – debugging :D
8 November 2012
Anonymous
I can't make it work... So that part is all going well «and enter 0E in the hex field – it's 14 in decimal notation.», but then when I open the script doesn't appear the questions anymore, only «corrupted». Am I not seeing something?
8 November 2012
Anonymous
Ops, I'm sorry, I just made it. It's work fine now. >_<
29 September 2012
Astarotte
What exactly is a String? You told us to change the byte from 14 to 13, but how exactly did you get the numbers 14 and 13? Where or what did you count?
29 September 2012
Proger_XP
A «string» is a series of bytes that are character codes, see ASCII. The numbers 14 and 13 above are not character codes but rather the string's length, which is often written before the first character so that the program knows how long the string is going to be and how many bytes it should read. Check the comment that's just below yours and my replies to it.
30 September 2012
Astarotte
Oh that makes sense. Your reply was very informative – thanks!
29 September 2012
Proger_XP
Reasonable question. The point about ASCII and other character sets is that they not only map printable symbols (Latin letters, digits, punctuation, etc.) but also so-called control characters – they have codes 0-31 (32 stands for space). You can check the ASCII charset table to see what I mean.
Control characters include line breaks (codes 13 and 10, for \r and \n respectively, if you're familiar with these symbols), tabulation (code 9) and so on.
So naturally, since these characters are nothing we can see (we can't «see» a line break the way we can see, say, the letter «A»), hex editors replace them with something visible. Some editors can be customized to use a custom placeholder symbol (some use a space, for example) but 010 Editor can't as far as I know and always uses periods for characters with codes below 32.
To answer your possible question – yes, you can't distinguish between a real period («full stop», code 46) and placeholders – but this isn't a problem because you have a raw hex dump on the left so check if that symbol has the code of 46 and if it does it's a full stop, otherwise it's a control char.
That said, normal text editors like Notepad or web browsers replace such symbols with squares or question marks.
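Here is a rough Python sketch of how a hex editor might build its right-hand column, to show where those periods come from. The function name is made up, and treating codes of 127 and above as placeholders too is my assumption for the sketch, not something I checked against 010 Editor.

```python
def printable_column(data: bytes, placeholder: str = ".") -> str:
    """Show printable ASCII as is and everything else as a placeholder."""
    return "".join(chr(b) if 32 <= b < 127 else placeholder for b in data)

# Length field (0D 00) followed by the text, typed in by hand.
raw = bytes.fromhex("0D 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E 2E")

print(raw.hex(" "))           # the left side: raw hex codes
print(printable_column(raw))  # ..Do not cry...
```

The first two periods stand for the bytes 0D and 00 (control characters), while the last three are real full stops with code 2E (46 in decimal) – the hex side is what tells them apart.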
29 September 2012
Anonymous
I see – thank you for your quick response! I am really excited to learn the basics of Visual Novel translating and I am very grateful of you for providing this guide.
One more question – on the hexed, it says «…Do not cry…» in the printable symbols window. I noticed that the periods [.] do not show up when the .exe is executed, and do not seem to count towards string length. What is/are the functions of [.], if there are any?
5 May 2012
Anonymous
Hey, how can I «update the length» of that line?
Here is what I've got (like you):
5 May 2012
Proger_XP
Hey Anon,
The 0E 00 bytes are the string length (0E in hex notation is 14 in decimal). Since your new string is «Do not cry...», which is 13 chars long, you need to change these 2 bytes to 0D 00.
15 May 2012
Proger_XP
I don't know what you mean by «default length» but yes, the aim is to change 0E (14) into 0D (13). This way the program will read a string 13 bytes long instead of 14 (since you've shortened it).
15 May 2012
Anonymous
That means I must change the default length to 13 instead of 14?
23 March 2012
Dedication
Its me again :). There was an option where you could edit the hex in just plain English characters. Can that work too?
23 March 2012
Proger_XP
If you're talking about View → Edit As → Text (Ctrl+H) then yes – when you're editing texts (sometimes you need to edit raw bytes and that mode won't be useful).
27 March 2012
Dedication
thanks!
18 March 2012
Dedication
I do not understand this step. I am unsure about the string length part, is there a video tutorial you could put up? I get the general idea; its just that i dont understand how change the code that it could make the byte shorter.
20 March 2012
Proger_XP
Sorry, I've only read your comment today. Yes, I'll make a short video tomorrow illustrating my point so stay around.
Best regards, Proger
23 March 2012
Dedication
i will see :3
21 March 2012
Proger_XP
I've just made the video – you can check it now although from your later comments it looks like you've already got over that problem – which is a good thing :)