Reading and Writing File Segments

Segments are the basis of everything in AsmResolver. They are the fundamental building blocks that together make up a binary file (such as a PE file). Segments are organized as a tree, where the leaves are single contiguous chunks of memory, while the nodes are segments that comprise multiple smaller sub-segments. The aim of segments is to abstract away the error-prone work of calculating offsets and sizes and keeping them up to date, allowing programmers to easily read binary files, as well as construct new ones.

Every class that directly translates to a concrete segment in a file on the disk implements the ISegment interface. In the following, some of the basics of ISegment as well as common examples will be introduced.

Basic Data Segments

The simplest and arguably the most commonly used form of segment is the DataSegment class. This class wraps a byte[] in an instance of ISegment, allowing it to be used in any context where AsmResolver expects a segment.

byte[] data = new byte[] { 1, 2, 3, 4 };
var segment = new DataSegment(data);

While the name of the DataSegment class implies it is used for defining literal data (such as a constant for a variable), it can be used to define any type of contiguous memory. This also includes a raw code stream of a function body and sometimes entire program sections.

Reading Segment Contents

Some implementations of ISegment (such as DataSegment) allow for reading binary data directly. Segments that allow for this implement IReadableSegment, which defines a function CreateReader that can be used to create an instance of BinaryStreamReader that starts at the beginning of the raw contents of the segment. This reader can then be used to read the contents of the segment.

byte[] data = new byte[] { 1, 2, 3, 4 };
IReadableSegment segment = new DataSegment(data);

var reader = segment.CreateReader();
reader.ReadByte(); // returns 1
reader.ReadByte(); // returns 2
reader.ReadByte(); // returns 3
reader.ReadByte(); // returns 4
reader.ReadByte(); // throws EndOfStreamException.

Alternatively, an IReadableSegment can be turned into a byte[] quickly using the ToArray() method.

byte[] data = new byte[] { 1, 2, 3, 4 };
IReadableSegment segment = new DataSegment(data);

byte[] allData = segment.ToArray(); // Returns { 1, 2, 3, 4 }
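
The reader is not limited to single bytes. The following is a minimal sketch, assuming the AsmResolver package is referenced and that BinaryStreamReader offers ReadUInt16 and ReadUInt32 methods that decode little-endian values (the byte order used by PE files). The byte values are made up for illustration.

```csharp
using System;
using AsmResolver;
using AsmResolver.IO;

// Six bytes: a 16-bit value followed by a 32-bit value, both stored little-endian.
byte[] data = { 0x34, 0x12, 0x78, 0x56, 0x34, 0x12 };
IReadableSegment segment = new DataSegment(data);

var reader = segment.CreateReader();
ushort first = reader.ReadUInt16(); // Consumes bytes 0..1.
uint second = reader.ReadUInt32();  // Consumes bytes 2..5.

Console.WriteLine("0x{0:X4} 0x{1:X8}", first, second); // Prints "0x1234 0x12345678" under the little-endian assumption.
```

Each read advances the reader, so fields are consumed in the order they appear in the segment.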

Composing new Segments

Many segments comprise multiple smaller sub-segments. For example, PE sections often do not contain just a single data structure, but are a collection of structures concatenated together. To facilitate more complicated structures like these, the SegmentBuilder class can be used to combine ISegment instances into one effortlessly:

var builder = new SegmentBuilder();

builder.Add(new DataSegment(...));
builder.Add(new DataSegment(...));

Many executable file formats require segments to be aligned to a certain byte boundary. The SegmentBuilder::Add method allows for specifying this alignment, and automatically adjusts the offsets and sizes accordingly:

var builder = new SegmentBuilder();

// Add some segment with potentially a size that is not a multiple of 4 bytes.
builder.Add(new DataSegment(...));

// Ensure the next segment is aligned to a 4-byte boundary in the final file.
builder.Add(new DataSegment(...), alignment: 4);
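
To make the effect of the alignment argument concrete, here is a small sketch of the arithmetic involved. It is plain C# and does not use AsmResolver; AlignUp is a hypothetical helper written for this illustration, and the start offset and segment size are assumed values, so this should be read as a model of the idea rather than as the library's implementation.

```csharp
using System;

// Hypothetical helper (not part of AsmResolver): rounds an offset up to the
// next multiple of the given alignment.
static uint AlignUp(uint offset, uint alignment) =>
    (offset + alignment - 1) / alignment * alignment;

uint start = 0x200;  // Assumed file offset at which the builder is placed.
uint firstSize = 6;  // Assumed size of the first segment (not a multiple of 4).

uint firstEnd = start + firstSize;       // 0x206: first byte after the first segment.
uint secondStart = AlignUp(firstEnd, 4); // 0x208: next 4-byte boundary.
uint padding = secondStart - firstEnd;   // 2 padding bytes in between.

Console.WriteLine("Second segment starts at 0x{0:X} after {1} padding byte(s).", secondStart, padding);
// Prints "Second segment starts at 0x208 after 2 padding byte(s)."
```

The padding depends on where the preceding segment ends, which is why the total size of a builder can change when it is moved to a different offset.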

Since SegmentBuilder implements ISegment itself, it can also be used within another SegmentBuilder, allowing for recursive constructions like the following:

var child = new SegmentBuilder();
child.Add(new DataSegment(...));
child.Add(new DataSegment(...));

var root = new SegmentBuilder();
root.Add(new DataSegment(...));
root.Add(child); // Nest segment builders into each other.

Resizing Segments at Runtime

Most segments in an executable file retain their size at runtime. However, some segments (such as a .bss section in a PE file) may be resized upon mapping it into memory. AsmResolver represents these segments using the VirtualSegment class:

var physicalContents = new DataSegment(new byte[] {1, 2, 3, 4});
section.Contents = new VirtualSegment(physicalContents, 0x1000); // Create a new segment with a virtual size of 0x1000 bytes.

Patching Segments

Some use-cases of AsmResolver require segments to be hot-patched with new data after serialization. This is done via the PatchedSegment class.

Any segment can be wrapped into a PatchedSegment via its constructor:

using AsmResolver.Patching;

ISegment segment = ...
var patchedSegment = new PatchedSegment(segment);

Alternatively, you can use the (preferred) fluent syntax:

using AsmResolver.Patching;

ISegment segment = ...
var patchedSegment = segment.AsPatchedSegment();

Applying the patches can then be done by repeatedly calling one of the Patch method overloads. Below is an example of patching a section within a PE file:

var peFile = PEFile.FromFile("input.exe");
var peImage = PEImage.FromFile(peFile); // Needed for the import lookup below.
var section = peFile.Sections.First(s => s.Name == ".text");

var someSymbol = peImage
   .Imports.First(m => m.Name == "ucrtbased.dll")
   .Symbols.First(s => s.Name == "puts");

section.Contents = section.Contents.AsPatchedSegment()                      // Create patched segment.
   .Patch(offset: 0x10, data: new byte[] {1, 2, 3, 4})                      // Apply literal bytes patch
   .Patch(offset: 0x20, AddressFixupType.Absolute64BitAddress, someSymbol); // Apply address fixup patch.

The patching API can be extended by implementing the IPatch interface yourself.

Calculating Offsets and Sizes

Typically, the ISegment API aims to abstract away any raw offset, relative virtual address (RVA), and/or size of a data structure within a binary file. However, in case the final offset and/or size of a segment still need to be determined and used (e.g., when implementing new segments), it is important to understand how this is done.

Two properties are responsible for representing the offsets:

  • Offset: The starting file or memory address of the segment.
  • Rva: The virtual address of the segment, relative to the executable’s image base at runtime.

Typically, these properties are read-only and managed by AsmResolver itself. However, to update the offsets and RVAs of a segment, you can call the UpdateOffsets method. This method traverses the entire segment recursively, and updates the offsets accordingly.

ISegment segment = ...

// Relocate a segment to an offset-RVA pair:
segment.UpdateOffsets(new RelocationParameters(offset: 0x200, rva: 0x2000));

Console.WriteLine("Offset: 0x{0:X8}", segment.Offset); // Prints "Offset: 0x00000200"
Console.WriteLine("Rva: 0x{0:X8}", segment.Rva);       // Prints "Rva: 0x00002000"

Warning

Try to call UpdateOffsets() as sparingly as possible. The method does a full pass on the entire segment, and updates all offsets of all sub-segments as well. It can thus be very inefficient to call it repeatedly.
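
Because the update is recursive, a single call on the root of a segment tree is enough to relocate its children. The sketch below assumes the AsmResolver package is referenced; the byte contents and the offset-RVA pair are arbitrary example values.

```csharp
using System;
using AsmResolver;

var first = new DataSegment(new byte[] { 1, 2, 3 });
var second = new DataSegment(new byte[] { 4, 5, 6, 7 });

var builder = new SegmentBuilder();
builder.Add(first);
builder.Add(second, alignment: 4);

// One call on the root; the sub-segments are updated as part of the same pass.
builder.UpdateOffsets(new RelocationParameters(offset: 0x200, rva: 0x2000));

Console.WriteLine("first:  offset 0x{0:X}, rva 0x{1:X}", first.Offset, first.Rva);
Console.WriteLine("second: offset 0x{0:X}, rva 0x{1:X}", second.Offset, second.Rva);
```

The first segment should start at the offset passed to the root, and the second at the next 4-byte boundary after it.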

The size (in bytes) of a segment can be calculated using either the GetPhysicalSize() or the GetVirtualSize() method. Typically, these two measurements are going to be equal, but for some segments (such as a VirtualSegment) they may differ:

ISegment segment = ...

// Measure the size of the segment:
uint physicalSize = segment.GetPhysicalSize();
uint virtualSize = segment.GetVirtualSize();

Console.WriteLine("Physical (File) Size: 0x{0:X8}", physicalSize);
Console.WriteLine("Virtual (Runtime) Size: 0x{0:X8}", virtualSize);
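
As a concrete case where the two sizes diverge, the following sketch measures a VirtualSegment built the same way as in the section on resizing segments. It assumes the AsmResolver package is referenced; the contents and the virtual size of 0x1000 are example values.

```csharp
using System;
using AsmResolver;

// Four bytes are stored in the file, but 0x1000 bytes are requested at runtime.
var physicalContents = new DataSegment(new byte[] { 1, 2, 3, 4 });
var virtualSegment = new VirtualSegment(physicalContents, 0x1000);

uint physicalSize = virtualSegment.GetPhysicalSize(); // Size of the wrapped file contents.
uint virtualSize = virtualSegment.GetVirtualSize();   // Size after mapping into memory.

Console.WriteLine("Physical (File) Size: 0x{0:X8}", physicalSize);
Console.WriteLine("Virtual (Runtime) Size: 0x{0:X8}", virtualSize);
```

Here the physical size reflects the wrapped DataSegment, while the virtual size reflects the value passed to the VirtualSegment constructor.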

Warning

Only call GetPhysicalSize() and GetVirtualSize() whenever you know the offsets of the segment are up to date. Due to padding requirements, many segments will have a slightly different size depending on the final file offset they are placed at.

Warning

Try to call GetPhysicalSize() and GetVirtualSize() as sparingly as possible. These methods do a full pass on the entire segment, and measure the total number of bytes required to represent it. It can thus be very inefficient to call them repeatedly.

Serializing Segments

Segments are serialized using the ISegment::Write method.

ISegment segment = ...

using var stream = new MemoryStream();
segment.Write(new BinaryStreamWriter(stream));

byte[] serializedData = stream.ToArray();

Alternatively, you can quickly serialize a segment to a byte[] using the WriteIntoArray() extension method:

ISegment segment = ...

byte[] serializedData = segment.WriteIntoArray();

Warning

Only call Write whenever you know the offsets of the segment are up to date. Many segments will contain offsets to other segments in the file, which may not be accurate until all offsets are calculated.
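
Putting the pieces together, a typical order of operations is to compose the segment, update its offsets once, and only then write it. The sketch below assumes the AsmResolver package is referenced; the contents and the zero offset-RVA pair are example values chosen for a standalone buffer.

```csharp
using System;
using System.IO;
using AsmResolver;
using AsmResolver.IO;

// 1. Compose the segment.
var builder = new SegmentBuilder();
builder.Add(new DataSegment(new byte[] { 1, 2, 3 }));
builder.Add(new DataSegment(new byte[] { 4, 5, 6, 7 }), alignment: 4);

// 2. Finalize the layout once, so all offsets are up to date.
builder.UpdateOffsets(new RelocationParameters(offset: 0, rva: 0));

// 3. Serialize.
using var stream = new MemoryStream();
builder.Write(new BinaryStreamWriter(stream));
byte[] serialized = stream.ToArray();

Console.WriteLine(BitConverter.ToString(serialized));
```

The output should contain the bytes of both data segments, with any alignment padding between them.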