Later on, we will see how the code path can go non-linearly, covering concepts like if-then, loops, and function calls. There are also eight 16-bit and eight 8-bit registers that are subparts of the eight 32-bit general-purpose registers. These features come from the 16-bit era of x86 CPUs, but still have some occasional use in 32-bit mode. Whenever the value of a 16-bit or 8-bit register is modified, the upper bits belonging to the full 32-bit register will remain unchanged.
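For instance, al and ax are the low 8 and low 16 bits of eax. A small illustration in GNU assembler (AT&T) syntax, matching instruction names like addl used later; the constant values are arbitrary:

```
movl $0x11223344, %eax    # eax = 0x11223344
movb $0xAA, %al           # modifies only the low 8 bits:  eax = 0x112233AA
movw $0xBBCC, %ax         # modifies only the low 16 bits: eax = 0x1122BBCC
                          # in both cases the upper bits of eax are untouched
```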
The most basic x86 arithmetic instructions operate on two 32-bit registers. The first operand acts as a source, and the second operand acts as both a source and destination. Many instructions fit this important schema; a few examples are sketched below. The bit-shifting and rotation instructions take a 32-bit register for the value to be shifted, and the fixed 8-bit register cl for the shift count.
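A sketch of a few instructions that follow this schema, in AT&T syntax (the destination is the second operand):

```
addl %ebx, %eax     # eax = eax + ebx
subl %ecx, %edx     # edx = edx - ecx
andl %esi, %edi     # edi = edi & esi
shll %cl, %eax      # eax = eax << cl; the shift count must live in cl
```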
Many arithmetic instructions can take an immediate value as the first operand. The immediate value is fixed (not variable) and is encoded into the instruction itself, as in the examples below.
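For example (AT&T syntax; immediates carry a $ prefix):

```
addl $5, %eax       # eax = eax + 5; the constant 5 is baked into the instruction
movl $100, %ecx     # ecx = 100
shll $3, %edx       # edx = edx << 3; here the shift count is a hard-coded immediate
```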
Now is a good time to talk about one principle in assembly programming: not every desirable operation is directly expressible in one instruction. In the typical programming languages that most people use, many constructs are composable and adaptable to different situations, and arithmetic can be nested. In assembly language, however, you can only write what the instruction set allows. To illustrate: when performing a bit shift, the shift count must be either a hard-coded immediate value or the register cl; it cannot be any other register. If the shift count is in another register, the value needs to be copied into cl first, as in the sketch below.
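A sketch: suppose the desired shift count happens to be sitting in ebx (a hypothetical register choice for illustration):

```
movb %bl, %cl       # copy the low 8 bits of ebx into cl
shll %cl, %eax      # now the variable-count shift is expressible
```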
There is a 32-bit register named eflags which is implicitly read or written by many instructions. In other words, its value plays a role in instruction execution, even though the register is not mentioned in the assembly code. Arithmetic instructions such as addl usually update eflags based on the computed result. Some instructions directly affect a single flag bit, such as cld clearing the direction flag (DF). Comparison instructions affect eflags without changing any general-purpose registers; most of the time, the instruction after a comparison is a conditional jump, covered later.
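A sketch (AT&T syntax; assume the registers were loaded earlier):

```
addl %ebx, %eax     # eax = eax + ebx, and eflags is updated based on the result
cmpl %edx, %ecx     # computes ecx - edx purely to set eflags; the result is
                    # discarded and ecx, edx are left unchanged (a conditional
                    # jump would typically follow to act on these flags)
```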
So far, we know that some flag bits are related to arithmetic operations. Other flag bits are concerned with how the CPU behaves, such as whether to accept hardware interrupts, virtual 8086 mode, and other system management matters that are mostly of concern to OS developers, not to application developers.
For the most part, the eflags register is ignorable: the system flags can be ignored entirely, and the arithmetic flags can be forgotten about except for comparisons and bigint arithmetic operations.
The CPU by itself does not make a very useful computer. It should go without saying that the CPU has instructions to read and write memory; specifically, you can load or store one or more bytes at any memory address you choose. When storing a value longer than a byte, the value is encoded in little endian. When reading values from memory, the same rule applies: the bytes at lower memory addresses get loaded into the lower parts of a register.
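A sketch of loads and stores in AT&T syntax, assuming ecx already holds the address of some writable memory and edx holds an array index; the register choices and values are arbitrary:

```
movl $0x11223344, (%ecx)   # store a 32-bit value at the address in ecx; the byte
                           # 0x44 goes to the lowest address (little endian)
movb (%ecx), %al           # load a single byte from that address; al = 0x44
movb %al, 3(%ecx)          # store a single byte at address ecx + 3
movb (%ecx,%edx), %bl      # load the byte at address ecx + edx (base + index)
```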
The simplest thing you can do with memory is to read or write a single byte, as in the example above. When we write code that has loops, often one register holds the base address of an array and another register holds the current index being processed.

I've found that setting the ILT pointer to zero also works, although I am not sure whether this behavior is officially supported. The Import Directory Table is terminated by an entry in which all fields are zero.
At runtime, the entries of the IAT are replaced with the actual addresses of the imported functions. Each entry begins with a 2-byte hint (which we'll ignore for now), followed by a null-terminated string containing the imported function's name, plus a trailing null byte if necessary to align the next entry on an even boundary.
With that out of the way, let's see how we would define our executable's import section. The directive for a new PE section is already familiar to us. In this case, we're communicating that the section we're about to introduce contains the import data, and that it needs to be made writeable when loaded into memory, since the addresses of the imported functions will be written into it. The rva operator, unsurprisingly, yields the relative virtual address of its argument.
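Here is a minimal sketch of what such a section might look like in fasm syntax, importing only ExitProcess from kernel32.dll. This is a reconstruction based on the description above, not the post's exact listing, and the label names are my own:

```
section '.idata' import data readable writeable

; Import Directory Table: one entry describing kernel32.dll, then an all-zero terminator
  dd rva kernel32_ilt      ; RVA of the Import Lookup Table
  dd 0                     ; time/date stamp
  dd 0                     ; forwarder chain
  dd rva kernel32_name     ; RVA of the DLL name string
  dd rva kernel32_iat      ; RVA of the Import Address Table
  dd 0, 0, 0, 0, 0         ; terminating entry: all fields zero

kernel32_ilt:
  dq rva ExitProcess_hint  ; points at the hint/name entry below
  dq 0                     ; terminator

kernel32_iat:
ExitProcess:
  dq rva ExitProcess_hint  ; the loader overwrites this with the function's address
  dq 0                     ; terminator

kernel32_name    db 'KERNEL32.DLL', 0

ExitProcess_hint dw 0                  ; the 2-byte hint, which we ignore
                 db 'ExitProcess', 0   ; the imported function's name
```

Before the program runs, the ILT and IAT contain the same RVAs pointing at the hint/name entry; at load time the loader resolves the name and overwrites the IAT entry in place with the real function address.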
We're now almost ready to finally call ExitProcess. One thing we have to answer first, though, is: how does a function call actually work? Think about it. There is a call instruction, which pushes the current value of rip onto the stack and transfers execution to the address specified by its operand. There is also the ret instruction, which pops an address off the stack and transfers execution there. Nowhere is it specified how arguments should be passed to a function, or how return values should be handled.
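Mechanically, that's all there is to it; here is a sketch in Intel syntax (matching the 64-bit code in this post), where compute is a made-up label:

```
  call compute       ; pushes the address of the next instruction (rsp decreases by 8),
                     ; then jumps to compute
  ; execution resumes here once compute executes ret

compute:
  ret                ; pops the saved return address off the stack and jumps to it
```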
The hardware simply doesn't care about that. It is the job of the caller and the callee to establish a contract between themselves. Such rules might look something like this: the caller shall push the arguments onto the stack, starting from the last one; the callee shall remove the parameters from the stack before returning;
and the callee shall place return values in the register eax. A set of rules like that is referred to as a calling convention, and there are many different calling conventions in use. When you want to call a function from assembly, you must know which calling convention it expects. The good news is that on 64-bit Windows there's pretty much only one calling convention you need to be aware of - the Microsoft x64 calling convention.
The bad news is that it's a tricky one - unlike many of the older conventions, it requires the first few parameters to be passed via registers (as opposed to on the stack), which can be good for performance. You can read the full docs if you're interested in the details; I will cover only the parts of the calling convention relevant to us here. The stack pointer has to be aligned to a 16-byte boundary. The first four integer or pointer arguments are passed in the registers rcx, rdx, r8 and r9; the first four floating-point arguments are passed in registers xmm0 to xmm3.
Any additional arguments are passed on the stack. Even though the first four arguments aren't passed on the stack, the caller is still required to allocate 32 bytes of space for them on the stack. This has to be done even if the function takes fewer than four arguments. The caller is responsible for cleaning up the stack. Putting these rules together, our call to ExitProcess ends up looking something like the sketch below.
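Reconstructed from the walkthrough below (so treat it as a sketch rather than the post's exact listing), the relevant lines look like this; ExitProcess is the import address table label defined earlier:

```
  sub  rsp, 40        ; re-align the stack and reserve the 32-byte "shadow space"
  xor  ecx, ecx       ; first (and only) argument: exit code 0, passed in rcx
  call [ExitProcess]  ; indirect call through the entry the loader filled in
```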
Let's go through the new lines one by one. First, we're subtracting 40 from the current value of the stack pointer (note that, somewhat counterintuitively, the stack "grows" downward, i.e. toward lower addresses). Thus, we're aligning the stack to a 16-byte boundary and allocating the "shadow space" for the first four arguments in one fell swoop. How does this work? Well, before our entry point was invoked, the stack pointer was aligned to a 16-byte boundary. As a result of the call, a return address was pushed onto the stack, diminishing the stack pointer value by 8 and throwing it out of alignment. We need to subtract another 8 bytes to bring it back into alignment, and another 32 bytes to account for the shadow space, hence the value 40. Next, we're setting the argument register rcx to zero by performing a bitwise exclusive-or of the register with itself. Finally, the square brackets around the label name denote indirection: rather than calling the address referred to by the label, the value stored in memory at that address is used as the target address for the call.
Of course, the label we're using is pointing to the location within the import table where the loader has written the address of the required function!
Fire it up in WinDbg again, run until our hardcoded breakpoint, then single-step to see how we eventually call ExitProcess, making note of how the rsp and rcx registers change. That's it for this first part.
Next time, we'll try to do something more interesting than just exiting the process :)

Just go through the first two videos in this video series; that is enough for understanding the memory layout. Buffer Overflow Megaprimer by Vivek Ramachandran. Exploit Research Megaprimer by Vivek Ramachandran. Real-time exploitation of a buffer overflow, which is very interesting; the exploitation is explained clearly, step by step. You can even try it yourself, as shown in the video, for practice.
Many people shy away from preparing for buffer overflows because it only helps with exploiting a single machine in the exam.
I have seen many people fail because of inadequate preparation on buffer overflows. Moreover, the OSCP itself is not the target.
Everything you learn here is for the real world. Some valuable resources are listed above. OSCP is difficult; have no doubts about that! There is no spoon-feeding here.
Refer to all the above references and do your own research on topics like service enumeration, penetration testing approaches, post-exploitation, privilege escalation, etc. Remember to always take notes, keeping a separate text note for each topic; they must be worked on continually. But first you need to get started! So, if you are anywhere near the idea of attempting the OSCP, just enrol and get started. Once you are good with all of the above pre-enrolment material, you are fully ready to enrol for the OSCP, and it is recommended to take 2 or 3 months of lab time.
As already mentioned, nowadays there are tools like Metasploit that ease the task for you, but knowing and understanding assembly can always come in handy. Well, to be a script kiddy..

William (September 23): Hack to learn.

Goodpeople (September 23): I do agree with William and others..