Endless Paradigm

Full Version: Basics of Hex Editing
I put this guide up, due to questions from people who aren't exactly experienced in hex editing.  Hopefully, this guide can clear a few things up :P  Although this is posted in a PSP section, this really applies to anything related to computers.
Warning: hex editing files can corrupt them if you're not sure what you're doing.  It's a good idea to always keep a backup of the files before you edit.  Quite a few hex editors will make backups automatically for you, but try not to rely on this.

Introduction
What is a Hex Editor for?
Okay, you've probably used Notepad before.  You open it up, type in some text, save it, then you can open it again all dandy and fine.  However, you may have also tried to open up EXEs or DLLs in Notepad?  If so, it'll probably look like a shrimped sushi, a butter-fried tempura and a temple full of sake...  Well not quite, but yeah, you get the idea.  Notepad can't really handle binary data.

What is a Hex Editor? [ Wikipedia Article ]
A hex editor is basically a file editor, like Notepad, except it's designed to edit binary data, whereas text editors (like Notepad) are designed to edit plain text.
A hex editor displays each byte of the file in hexadecimal format rather than the more familiar ASCII format.

What is Hexadecimal? [ Wikipedia Article ]
Simply, hexadecimal is a base 16 number system.  We usually use a base 10 number system (decimal), however, in computing, base 16 is a lot more useful.  Hexadecimal numbers are usually preceded with "0x" to avoid confusion - eg 0x10 is the number 16.  There are various other ways of representing hex numbers, eg 10h or 10₁₆, but I'll use the 0x notation.
So how does a base 16 number system work?
In our base 10 number system, we have the digits 0 to 9.  Once we get past 9, that is, at 10, we need to start using two digits to represent the number.  For a base 16 number system like hexadecimal, we only start using two "digits" to represent the number after 15.  So how do we represent the numbers 10 to 15 as single "digits"?  We use the letters A to F.
Here's a short table, showing the conversion between hexadecimal and decimal:

Hex        Dec
0x00        0
0x01        1
0x02        2
    ...
0x09        9
0x0A       10
0x0B       11
    ...
0x0F       15
0x10       16
Keep in mind how numbers work in a base 16 number system.
For example, for our base 10 number system, the number 258,654 really means:

258,654 = 2 × 10^5
        + 5 × 10^4
        + 8 × 10^3
        + 6 × 10^2
        + 5 × 10^1
        + 4 × 10^0

So, the base 16 number 0x59E2DC would have a decimal value:
0x59E2DC = 5        × 16^5
         + 9        × 16^4
         + 14 (0xE) × 16^3
         + 2        × 16^2
         + 13 (0xD) × 16^1
         + 12 (0xC) × 16^0
         = 5,890,780 (decimal)

Of course, manually doing all these conversions is cumbersome.  You can use the Windows Calculator to convert between the number systems.  Just open the Calculator, go View menu » Scientific, and you can switch between the two number systems via the option buttons (Hex and Dec).
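If you'd rather check a conversion in code, here's a minimal Python sketch of the same positional expansion.  The name hex_to_dec is made up for this example; Python's built-in int(text, 16) does the same job.

```python
# Positional expansion of a hex string, one digit at a time.
# hex_to_dec is an illustrative name, not a standard function.
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_dec(text):
    """Convert a hex string such as '59E2DC' to its decimal value."""
    value = 0
    for ch in text.upper():
        # Shift what we have one hex place to the left, then add the new digit.
        value = value * 16 + HEX_DIGITS.index(ch)
    return value

print(hex_to_dec("59E2DC"))   # 5890780
print(int("59E2DC", 16))      # built-in conversion, same result
print(hex(5890780))           # back to hex: 0x59e2dc
```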

Okay, so what's the point in hex editing?
Well it allows you to edit binary data without screwing stuff up like Notepad would, and do some basic (and even some more advanced) hacking. :P
Also helps you understand how things work.

Okay, so where can I get a hex editor?
There's a good list at Wikibooks: http://en.wikibooks.org/wiki/Reverse_Eng...ex_Editors

My personal favourite is UltraEdit32, however Hex Workshop is probably the most popular.  Note that both of these aren't free.  Some nice free hex editors are XVI32 and HxD.


Basic Hex Editing
I'll start off with some basic examples.  In these, I'll be using UltraEdit32.

Open Notepad, and copy the following text:
Quote:This is a demonstration text file to give you an idea of hex editing
Save this as a TXT file somewhere.  Now open up your hex editor, and open the TXT file in it.  Note that with some hex editors, such as UltraEdit, you'll need to switch to "Hex View" - in UltraEdit, just press Ctrl+H to do this.
You should now have a screen like the following:
[Image: scrhb1.png]

You'll notice the text that you entered is in the "ASCII Display region" - basically, this is the ASCII representation of data, and it should be nearly identical to what you would see in Notepad.
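For your information, ASCII is basically the standard text system - each character is assigned a number, for example, capital "A" is assigned the number 65 and lower case "a" is assigned 97 (note, you don't need to remember this! XD).  Standard ASCII defines 128 characters (0 to 127); the "extended" 8-bit character sets that Notepad and most hex editors display assign characters to the rest of the 256 possible byte values.  If you want, you can find an ASCII table here.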
Note that each character in the ASCII Display region corresponds to one byte.  I will refer to them as bytes hereafter.

Now to explain the other two sections.
The Location part pretty much displays the location of the first character in each line.  Note that locations are in hexadecimal (recall that the suffix "h" is the same as the prefix "0x").  You should see that there are 16 (or 0x10) bytes to a line in UltraEdit32 (may be different for different hex editors).
Also note that locations start at 0x00, that is, the location of the first byte in any file is 0x00.  The location of the 2nd byte in a file will be at location 0x01.  The location of byte number 52 will be at 0x33 and so on.  The location of a byte is also usually referred to as its offset.
The Hex Dump part displays each byte (or character) of the file in hexadecimal.  As said previously, the lowercase letter "a" is given the ASCII value 97 (0x61) - so if you add a byte, or edit one, to be the letter "a" in the first line (try selecting it in the ASCII Display section), you'll notice that the corresponding Hex Dump value is "61".
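To see how the three regions relate, here's a rough Python sketch that prints the demonstration text in the same Location / Hex Dump / ASCII layout.  hex_dump is a made-up helper, and the exact spacing is an assumption - your hex editor's layout will differ slightly.

```python
# A tiny hex dump: Location, Hex Dump and ASCII Display, 16 bytes per line.
# hex_dump is an illustrative helper, not part of any hex editor.
def hex_dump(data, width=16):
    pad = width * 3 - 1   # width of a full hex column ("XX " per byte)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is, everything else as a dot.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}h: {hex_part:<{pad}} ; {ascii_part}")
    return lines

text = b"This is a demonstration text file to give you an idea of hex editing"
for line in hex_dump(text):
    print(line)
```

The first line printed should start at 00000000h and the second at 00000010h, since each line holds 0x10 bytes.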

So I still don't really get the point of a Hex Editor...
Try changing the first byte/character of that file to 0x00 (null character).  You'll need to click in the Hexadecimal display area, on the first hex digit, and then enter the numbers.  After you've done that, you should get something like this:
[Image: scrrl1.png]
Save the file, then open it in Notepad - what do you see?
For your information, a lot of binary files make use of the Null (0x00) character.  If you try to edit these with a text editor like Notepad, all you'll get is a mess.
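The same edit can be sketched in Python.  This works on an in-memory copy of the demonstration text rather than a real file, so nothing on disk gets touched; with a real file you would read it in binary mode ("rb"), patch the bytes, and write them back with "wb".

```python
# Overwrite the first byte of the demonstration text with 0x00.
original = b"This is a demonstration text file to give you an idea of hex editing"

patched = bytearray(original)   # bytes are immutable, a bytearray is editable
patched[0] = 0x00               # offset 0x00 is the first byte in the file

print(original[:4].hex(" "))          # 54 68 69 73
print(bytes(patched[:4]).hex(" "))    # 00 68 69 73
print(len(patched) == len(original))  # True: same size, one byte changed
```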

Okay, so what can I do with hex editing?
Well, quite a bit.  If you've read up to this point, then you're basically saying that you're interested :P
You've probably already seen stuff which requires you to hex edit something.
Hex editing will also allow you to perform some basic hacking.  I figured out the RCO format via hex editing.


Basic Hex Editing Info
Numbers, more specifically, integers, basically make up computers.  So it comes as no surprise that knowing how integers are stored will play a big role.
Computers store information in bytes, however, since a byte is made up of 8 bits, it can only hold 2^8 = 256 different values.  Quite often, numbers are made up of multiple bytes, usually:

1 byte (Char or int8) [range = 2^8 = 256 values]
2 bytes (Word or int16) [range = 2^16 = 65,536 values]
4 bytes (DWord or int32 or long int) [range = 2^32 = 4,294,967,296 values]
8 bytes (QWord or int64 or long int64) [range = 2^64 = 18,446,744,073,709,551,616 values]
(note the term "word" is not to be confused with CPU word sizes)
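The value counts in this list are just 2 to the power of the number of bits.  A quick sketch to reproduce them (value_count is a name invented for this example):

```python
def value_count(num_bytes):
    """How many distinct values fit in num_bytes bytes (8 bits each)."""
    return 2 ** (8 * num_bytes)

for name, size in [("Char", 1), ("Word", 2), ("DWord", 4), ("QWord", 8)]:
    print(f"{name:<5} {size} byte(s): {value_count(size):,} values")
```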
Numbers are typically stored in reverse byte order (little endian format) in files, that is, the least significant byte comes first.  For example, the 4 byte number 0xFE422615 would appear as:
00000000h: 15 26 42 FE                                     ; .&Bþ
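If you want to check byte order yourself, Python's struct module can pack a number both ways.  The value 0x12345678 below is just an arbitrary example chosen so the reversal is easy to spot.

```python
import struct

value = 0x12345678   # arbitrary example value

# "<I" = little endian, unsigned 4 byte integer (DWord)
little = struct.pack("<I", value)
# ">I" = big endian, for comparison
big = struct.pack(">I", value)

print(little.hex(" "))   # 78 56 34 12
print(big.hex(" "))      # 12 34 56 78

# Reading the bytes back works the same way in reverse.
print(hex(struct.unpack("<I", little)[0]))   # 0x12345678
```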

Signed integers are those which can take on negative values.  The primary method used to represent signed integers is two's complement.
The range of an n-bit signed integer is -2^(n-1) to 2^(n-1) - 1.
Let's take an 8 bit integer for example.  If it were unsigned, it could store the values 0 to 255.  A signed 8 bit integer, however, stores the values -128 to 127.  How does it do this though?  Basically, the numbers 0 to 127 are stored "normally", so:
0x60 (96 in decimal) still has the decimal value 96 regardless of being signed or unsigned.
However, the unsigned numbers 128 to 255 are different for signed numbers.  You can think of it with this formula:
Signed value = Unsigned value - 256

Eg:
0xFF = 255 (unsigned value) = -1 (signed value)
0x80 = 128 (unsigned value) = -128 (signed value)
0xBB = 187 (unsigned value) = -69 (signed value)
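Here's a small sketch that reproduces these examples for any bit width.  to_signed is a name made up for illustration; it subtracts 2^bits whenever the top bit is set.

```python
def to_signed(unsigned, bits=8):
    """Reinterpret an unsigned integer as a two's complement signed one."""
    if unsigned >= 2 ** (bits - 1):      # top bit set: negative number
        return unsigned - 2 ** bits      # e.g. 255 - 256 = -1
    return unsigned                      # 0..127 stay as they are (for 8 bits)

for raw in (0x60, 0xFF, 0x80, 0xBB):
    print(f"0x{raw:02X} = {raw} unsigned = {to_signed(raw)} signed")
```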

So... when you're hex editing, you may notice some sequence of 0xFFFFFFFF.  It's possible that this could be the value 4294967295, however, it's much more likely to be a signed integer with the value -1 (a common value used to represent something "invalid" or "not present" in programming).
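Both readings of those four bytes can be checked with Python's int.from_bytes, which takes a signed flag:

```python
raw = bytes([0xFF, 0xFF, 0xFF, 0xFF])   # four FF bytes as seen in a hex dump

as_unsigned = int.from_bytes(raw, "little", signed=False)
as_signed = int.from_bytes(raw, "little", signed=True)

print(as_unsigned)   # 4294967295
print(as_signed)     # -1
```

The bytes in the file are identical either way; only the interpretation changes, so you have to work out from context which one the file format intends.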

Floating point numbers may also appear at times in files.  You may have figured that integers are limited to whole numbers.  Floating point numbers allow decimal points.  How floats work is beyond the scope of this guide - if you're interested, see Wikipedia.
I don't think there's an easy way to edit floating point numbers in UltraEdit, unfortunately.  You can use Hex Workshop instead - select the byte at which the float starts, then at the bottom of the screen, you should be able to find and edit the floating point representation of the number.
There are two common formats for floating point numbers, single precision and double precision.  Single precision floats take up 4 bytes, whereas double precision floats take up 8 bytes.
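If your hex editor can't show floats, a couple of lines of Python can do the conversion for you.  This sketch assumes the floats are stored little endian in the usual IEEE 754 format, which is what struct's "<f" and "<d" codes produce.

```python
import struct

single = struct.pack("<f", 1.0)   # 4 byte single precision, little endian
double = struct.pack("<d", 1.0)   # 8 byte double precision, little endian

print(len(single), single.hex(" "))   # 4 00 00 80 3f
print(len(double), double.hex(" "))   # 8 00 00 00 00 00 00 f0 3f

# Going the other way: decode 4 bytes found in a file as a float.
print(struct.unpack("<f", bytes.fromhex("0000803F"))[0])   # 1.0
```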

Common Terms
Char - usually 1 byte
Word - 2 byte integer
DWord - 4 byte integer - probably most often used
QWord - 8 byte integer - not commonly used

Offset - refers to a location.  Can be absolute (usually the case) meaning the position in the file.  Sometimes it can be relative, which refers to a difference, eg an offset of 0x05 from the location 0x02 would give the location 0x07
Nibble - 4 bits - yes half a byte is a nibble, didn't you know that?  Since one hex digit represents 4 bits, one nibble = one hex digit.
Header - a common thing used in files - usually a section at the beginning of a file which describes various things about the body of the file
Pointer - another thing which is often used, usually appears in header sections.  A pointer holds an address value (points to a certain location in the file), and is almost always a DWord.
String - refers to a "string" of text, or just basically, text.  For example, "Hello" is an example of a 5 character string.
Null - usually means 0 - eg a null character is a Char (1 byte integer) whose value is 0 or 0x00
Signed/Unsigned - Unsigned integers are never negative, whereas signed integers can be negative.
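To tie a few of these terms together, here's a sketch that reads a header, follows a pointer and pulls out a null-terminated string.  The 16 byte "file" below is completely made up for illustration - it is not any real format.

```python
import struct

# A made-up 16 byte file, purely for illustration:
#   0x00  4 bytes  magic "DEMO"                   (header)
#   0x04  DWord    pointer to a string            (header, little endian)
#   0x08  2 bytes  padding
#   0x0A  6 bytes  "Hello" followed by a null     (body)
data = b"DEMO" + struct.pack("<I", 0x0A) + b"\xFF\xFF" + b"Hello\x00"

magic = data[0x00:0x04]
(pointer,) = struct.unpack_from("<I", data, 0x04)   # read the DWord at offset 0x04

end = data.index(b"\x00", pointer)                  # the string runs up to the null
string = data[pointer:end].decode("ascii")

print(magic, hex(pointer), string)   # b'DEMO' 0xa Hello
```

Here the pointer is an absolute offset (0x0A from the start of the file).  In a format that uses relative offsets you would add the pointer to some base location instead.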


Anyways, there's some basic hex editing info.  Hope this helped.  Suggestions etc welcome.
nice tut zinga thanks

gonna put this to use
I'm wondering how to do a wildcard replacement on topmenu/vsh to minimize the focus/shadow effect that bstronga did ;)
^ Kinda off-topic but you mean batch replace the glow/shadow icons?
Great tut zinga, this will help me when I make my first about screen !

:great:
Holy bajeezus ZiNgA!  I've been waiting my WHOLE LIFE for a tut like this!!!
(Ok, a little over-enthusiastic, but the tut's an awesome contribution nonetheless - thanks heaps)!
ZiNgA BuRgA Wrote:^ Kinda off-topic but you mean batch replace the glow/shadow icons?
not using dummy pic for replacement, it's something like this, which bstronga did in vsh prx/topmenu rco
O_o, never seen that before.  He could've just nulled the label section I guess.

Thanks for comments guys :P
I learned a lot from this tutorial.
nice tutorial zinga, this is pretty cool
i just use the hex editor part of notepad++ :P