Message structure
How HL7v2 messages are organized — segments, fields, components, sub-components, and the delimiters that separate them.
An HL7v2 message uses a custom hierarchical format that resembles the EDI (Electronic Data Interchange) format. Understanding the hierarchy is essential to interpreting and processing messages — it is what allows healthcare systems to exchange and share information with consistent meaning.
The development of HL7v2 predates XML, JSON, and other modern data formats. As a result, HL7v2 messages present specific challenges when integrating with modern software.
Modern languages have limited HL7v2 support
Many modern programming languages and software stacks do not provide native libraries for HL7v2. Developers usually rely on third-party libraries — often poorly maintained — or write custom parsers. This adds complexity and friction to integration work.
The basic structure of an HL7v2 message is hierarchical and composed of segments, fields, and components. Each segment represents a logical unit of information, and each field represents a specific data element within the segment. Components and sub-components further structure complex data elements within fields. This hierarchical nature lets HL7v2 represent a wide range of healthcare data in a structured form.
Message segments
Segments are the building blocks of an HL7v2 message. There are many segment types, each serving a specific purpose. Each segment type is identified by a three-character code and represents a logical unit of information. Common segments include MSH (Message Header), PID (Patient Identification), PV1 (Patient Visit), and OBX (Observation/Result), among many others. Each message event defines its own set of segments.
Each event's segment list is documented in the HL7v2 events reference.
Hierarchical structure
HL7v2 messages have a hierarchical structure: segments appear in a predefined order within the message, and each segment can contain multiple fields. The order is dictated by the message type and follows a standardized sequence.
Consider the following HL7v2 message representing a patient admission:
MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|123456789||ADT^A01|MessageControlID|P|2.8|
EVN|A01|20220101080000||||
PID|1||RTH9876^^^MRN^MR~12345^^^SSN^SS||LastName^FirstName||
PV1|1|I|AssignedLocation||||||||||||
The message above is composed of four segments, each representing a specific type of information. Each segment occupies a separate line because segments are terminated by the carriage return character \r (hex 0x0D). Some implementations use the line feed \n, or the pair \r\n, instead. A bare \r is not rendered as a visible line break in many editors, so a raw message can appear as one long line. We discuss delimiters in more detail later in this page.
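As a minimal sketch of the first parsing step, the snippet below splits a raw message into segments while tolerating all three terminators mentioned above. The function names `split_segments` and `segment_ids` are illustrative, not part of any HL7 library.

```python
import re

# The sample message from the text, joined with carriage returns as it
# would typically appear on the wire.
RAW_MESSAGE = "\r".join([
    r"MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|123456789||ADT^A01|MessageControlID|P|2.8|",
    "EVN|A01|20220101080000||||",
    "PID|1||RTH9876^^^MRN^MR~12345^^^SSN^SS||LastName^FirstName||",
    "PV1|1|I|AssignedLocation||||||||||||",
])


def split_segments(raw: str) -> list[str]:
    """Split a raw HL7v2 message into segment strings.

    Accepts CR, LF, or CRLF as the segment terminator and drops empty lines.
    """
    return [line for line in re.split(r"\r\n|\r|\n", raw) if line]


def segment_ids(raw: str) -> list[str]:
    """Return the three-character ID at the start of each segment."""
    return [segment[:3] for segment in split_segments(raw)]


print(segment_ids(RAW_MESSAGE))  # ['MSH', 'EVN', 'PID', 'PV1']
```

Splitting on any of the three terminators is a pragmatic choice for messages that have passed through editors or file transfers; a strict receiver might accept only \r.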
In this example the segments mean:
| Segment | Title | Description |
|---|---|---|
| MSH | Message Header | The first segment in nearly every HL7v2 message. Contains essential information about the message itself: type, message control ID, sending and receiving application information, and the timestamp of message creation. |
| EVN | Event Type | Information about the event that triggered the message — event type code, event date and time, and other relevant details. |
| PID | Patient Identification | Demographic information about a patient: ID, name, date of birth, gender, address, contact details, and other identifying data. |
| PV1 | Patient Visit | Information related to a patient's visit or encounter: admission date, discharge date, visit number, patient class, assigned location, attending physician, and so on. |
For more details about ADT, see the ADT A01 message reference.
Required and optional segments
Some segments are required; others are optional. The MSH segment is required and must always be the first segment in every HL7v2 message. Beyond MSH, the list of segments, and which are required versus optional, varies by message type and trigger event. The ADT A01 message defines a specific list of required and optional segments in a specific order. All four segments in the example above (MSH, EVN, PID, and PV1) are required in ADT A01. Optional segments such as PD1, NK1, and PV2 may appear between and after them. The table below is an excerpt, in relative order:
| Order | Segment | Title | Optionality |
|---|---|---|---|
| 1 | MSH | Message Header | Required |
| 2 | EVN | Event Type | Required |
| 3 | PID | Patient Identification | Required |
| 4 | PD1 | Patient Additional Demographic | Optional |
| 5 | NK1 | Next of Kin / Associated Parties | Optional |
| 6 | PV1 | Patient Visit | Required |
| 7 | PV2 | Patient Visit - Additional Information | Optional |
For the full list of segments and their optionality in ADT A01, see the ADT A01 message reference.
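A receiver can check segment presence and order before processing a message. The sketch below is one way to do that. The function name `check_segments` is hypothetical, and the required list passed in the usage lines is an illustrative subset, not the full ADT A01 definition; take the real list from the message reference for the version in use.

```python
def check_segments(segment_ids, required):
    """Return a list of problems found in a message's segment IDs.

    segment_ids: IDs in message order, e.g. ["MSH", "PID"].
    required: IDs that must appear, in the order the message definition
              lists them (supplied by the caller; an assumption here).
    """
    problems = []
    if not segment_ids or segment_ids[0] != "MSH":
        problems.append("MSH must be the first segment")

    # Record where each required segment first appears.
    positions = []
    for seg_id in required:
        if seg_id in segment_ids:
            positions.append(segment_ids.index(seg_id))
        else:
            problems.append(f"missing required segment {seg_id}")

    # Required segments that are present must keep their relative order.
    if positions != sorted(positions):
        problems.append("required segments are out of order")
    return problems


REQUIRED = ("MSH", "PID")  # illustrative subset only

print(check_segments(["MSH", "EVN", "PID", "PV1"], REQUIRED))  # []
print(check_segments(["MSH", "PV1"], REQUIRED))  # ['missing required segment PID']
```

Returning a list of problems rather than raising on the first one lets the caller report everything wrong with a message in a single rejection.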
Other segment types
Other commonly used segment types and their roles:
| Segment | Title | Description |
|---|---|---|
| OBX | Observation / Result | Conveys clinical observations and results — laboratory tests, vital signs, imaging findings. Includes observation identifier, value, units, reference ranges, and result status. |
| PV1 | Patient Visit | Information about a patient's visit: admission date, discharge date, visit number, patient class, assigned location, attending physician. |
| ORC | Order Control | Information about orders and their status — order control code, placer order number, filler order number, order status. |
| DG1 | Diagnosis | Diagnosis-related information: diagnosis codes, descriptions, types, and related details. |
These are a few examples. Each segment type has a specific role in conveying a particular kind of information.
Message fields
Each segment can contain multiple fields. Fields are individual data elements within a segment that hold specific information. The number of fields varies depending on the segment type and what it represents. Fields are sequentially numbered and contain different kinds of data — names, dates, observations, results. The content and meaning of each field is determined by its position within the segment.
Field sequence
Fields within a segment are sequentially numbered and have specific positions in the segment structure. The receiver interprets and extracts data from each field by position. In the PID segment, for instance, field 5 represents the patient's name, field 7 represents the date of birth, and so on. The exact content and meaning of fields are documented in the segment reference for the version in use.
Fields are separated by the field delimiter — the pipe character | (Hex 0x7C) by default. We cover delimiters later in this page.
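Positional access can be sketched in a few lines. The helper below (`get_field` is a hypothetical name) returns a field by its 1-based number. One detail worth noting: in the MSH segment, field 1 is the field separator character itself, so MSH fields sit one position earlier than a naive split suggests.

```python
def get_field(segment: str, number: int, field_sep: str = "|") -> str:
    """Return field `number` (1-based) of a segment, or "" if absent."""
    if number < 1:
        raise ValueError("field numbers start at 1")
    parts = segment.split(field_sep)
    if parts[0] == "MSH":
        # MSH-1 is the field separator character itself, so every later
        # MSH field sits one position earlier in the split list.
        if number == 1:
            return field_sep
        index = number - 1
    else:
        # parts[0] is the segment ID, so field n is at index n.
        index = number
    return parts[index] if index < len(parts) else ""


PID = "PID|1||RTH9876^^^MRN^MR~12345^^^SSN^SS||LastName^FirstName||"
MSH = r"MSH|^~\&|SendingApp|SendingFacility|ReceivingApp|ReceivingFacility|123456789||ADT^A01|MessageControlID|P|2.8|"

print(get_field(PID, 5))   # LastName^FirstName
print(get_field(MSH, 9))   # ADT^A01
print(get_field(MSH, 12))  # 2.8
```

Returning an empty string for a field beyond the end of the segment treats an omitted trailing field the same as an empty one, which is a design choice of this sketch.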
Optionality
As with segments, some fields are optional. But because fields are sequentially numbered, the position of a field must be preserved even when the field is empty. An empty field is represented by an empty string between two delimiters. If a patient's name is not available, the PID segment still contains an empty field 5; the position is preserved.
In the example below, the PID segment contains an empty field 6 — the patient's mother's maiden name is not available.
PID|1||RTH9876^^^MRN^MR~12345^^^SSN^SS||LastName^FirstName||
How do we know field 6 is the mother's maiden name?
The HL7v2 reference documentation defines the structure and meaning of every field within a segment. The PID segment reference lists each field by position. The reference is the place to look up exact field semantics.
The number of fields in a segment depends on the segment type and the HL7 version, and implementations often populate only some of them. The PID segment defines 30 fields in HL7 v2.3, and later versions add more, but a particular implementation may use far fewer. The available fields are governed by the segment definition for the version in use.
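When constructing a segment, the same rule applies in reverse: every skipped position must still be written as an empty field. The sketch below builds a segment from a sparse mapping of field numbers to values. `build_segment` is a hypothetical helper, and the date of birth in the usage line is made-up sample data.

```python
def build_segment(segment_id: str, fields: dict[int, str], field_sep: str = "|") -> str:
    """Build a segment from a sparse {field number: value} mapping.

    Positions with no value are emitted as empty strings so that every
    later field keeps its number. Not intended for MSH, whose first
    field is the separator itself.
    """
    last = max(fields, default=0)
    values = [fields.get(number, "") for number in range(1, last + 1)]
    return field_sep.join([segment_id, *values])


segment = build_segment("PID", {
    1: "1",
    3: "RTH9876^^^MRN^MR",
    5: "LastName^FirstName",
    7: "19800101",  # made-up date of birth, for illustration only
})
print(segment)  # PID|1||RTH9876^^^MRN^MR||LastName^FirstName||19800101
```

Fields 2, 4, and 6 were never supplied, yet each still occupies its place between two pipes, so field 7 remains field 7 for the receiver.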
Delimiters
Delimiters are special characters that separate segments, fields, repetitions, components, and sub-components within HL7v2 messages. They define the structure and hierarchy of data within a message. Unlike JSON or XML, HL7v2 has no brackets, tags, or key names: structure is conveyed entirely by position and by a small set of delimiter characters, one for each level of the hierarchy.
The delimiters are one reason HL7v2 can look opaque to a human reader, but they are essential for accurate parsing.
Custom delimiters
The delimiters used within an HL7v2 message are configurable: each message declares its own at the start of the MSH segment, which is why the example message begins with MSH|^~\&. In practice custom delimiters are rare. The defaults are almost universal, and a real-world implementation with custom delimiters is unusual.
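Because each message declares its delimiters at the start of the MSH segment (the character right after "MSH" is the field separator, and the next four are the component, repetition, escape, and sub-component characters), a parser can read them instead of hard-coding the defaults. The sketch below assumes a four-character encoding field. `Delimiters` and `read_delimiters` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delimiters:
    """Delimiter set for one message; the defaults are the usual ones."""
    field: str = "|"
    component: str = "^"
    repetition: str = "~"
    escape: str = "\\"
    subcomponent: str = "&"


def read_delimiters(message: str) -> Delimiters:
    """Read the delimiters declared at the start of the MSH segment."""
    if not message.startswith("MSH") or len(message) < 8:
        raise ValueError("message does not start with a complete MSH header")
    field = message[3]        # MSH-1: the field separator
    encoding = message[4:8]   # first four characters of MSH-2
    return Delimiters(field, encoding[0], encoding[1], encoding[2], encoding[3])


print(read_delimiters(r"MSH|^~\&|SendingApp|SendingFacility"))
```

For a message that uses the defaults, the result equals `Delimiters()`, so downstream code can treat default and custom delimiters uniformly.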
The default HL7v2 delimiters are:
| Delimiter | Character | Hex | Description |
|---|---|---|---|
| Field | \| | 0x7C | The pipe separates one field from another within a segment. |
| Component | ^ | 0x5E | The caret separates components within a field. In the PID segment, the patient name comprises last name, first name, middle name, and so on, separated by carets. |
| Repetition | ~ | 0x7E | The tilde separates repetitions within a field. The PID segment can carry multiple patient identifiers; each repetition is separated by a tilde. |
| Escape character | \\ | 0x5C | The backslash introduces escape sequences for special characters within a field. A literal pipe in a value is written as \F\ and a literal caret as \S\ in the data. |
| Sub-component | & | 0x26 | The ampersand separates sub-components within a component. In a patient identifier (PID field 3), the assigning authority component can carry a namespace ID, a universal ID, and a universal ID type, separated by ampersands. |
| Segment | \r | 0x0D | The carriage return separates segments within a message. In some implementations, the line feed \n is used instead. The combination \r\n also appears. |
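The delimiter levels in the table nest: a field splits into repetitions, each repetition into components, and each component into sub-components. A minimal sketch of that nesting, assuming the default delimiters and leaving escape sequences untouched (`parse_field` is a hypothetical helper, and it should not be applied to MSH field 2, which holds the delimiter characters themselves):

```python
def parse_field(value, repetition="~", component="^", subcomponent="&"):
    """Split a raw field into repetitions -> components -> sub-components.

    Returns a list (repetitions) of lists (components) of lists
    (sub-components) of strings. Escape sequences are not decoded.
    """
    return [
        [comp.split(subcomponent) for comp in rep.split(component)]
        for rep in value.split(repetition)
    ]


# PID field 3 from the example message: two identifiers separated by "~".
identifiers = parse_field("RTH9876^^^MRN^MR~12345^^^SSN^SS")

print(len(identifiers))       # 2
print(identifiers[0][0][0])   # RTH9876
print(identifiers[1][4][0])   # SS
```

Splitting always from the outermost delimiter inward matters: splitting on the caret first would cut across repetition boundaries and merge the two identifiers.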
These delimiters define the structure and hierarchy of data within an HL7v2 message. Using and interpreting them correctly is what makes accurate parsing possible.
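Values that contain a delimiter character must be escaped before they are written and decoded after they are read. The sketch below covers only the five standard sequences for the delimiter characters themselves (\F\, \S\, \T\, \R\, \E\) and assumes the default delimiters. Other sequences, such as hexadecimal or formatting escapes, are passed through unchanged. The names `escape` and `unescape` are illustrative.

```python
import re

# Escape codes for the delimiter characters, mapped to the default delimiters.
ESCAPE_CODES = {"F": "|", "S": "^", "T": "&", "R": "~", "E": "\\"}

# Reverse mapping: delimiter character -> escape sequence such as \F\
_TO_SEQUENCE = {char: f"\\{code}\\" for code, char in ESCAPE_CODES.items()}

# Matches a backslash, one uppercase letter, and a closing backslash.
_ESCAPE_RE = re.compile(r"\\([A-Z])\\")


def escape(value: str) -> str:
    """Replace delimiter characters in a value with escape sequences."""
    # A single pass over the characters avoids escaping the backslashes
    # that earlier replacements would have introduced.
    return "".join(_TO_SEQUENCE.get(ch, ch) for ch in value)


def unescape(value: str) -> str:
    """Decode the five delimiter escape sequences; leave others as-is."""
    def replace(match):
        return ESCAPE_CODES.get(match.group(1), match.group(0))
    return _ESCAPE_RE.sub(replace, value)


print(escape("A|B^C"))            # A\F\B\S\C
print(unescape(r"A\F\B\S\C"))     # A|B^C
```

Decoding should happen only after a field has been split into its components, otherwise a decoded caret or pipe would be mistaken for a real delimiter.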
Adhering to the predefined order of segments and the expected positions of fields is essential when constructing or processing HL7v2 messages. Adherence to structure is what makes data integrity, accuracy, and reliable exchange possible.
HL7v2 is not a modern data format. It predates XML, JSON, and similar formats, and many modern programming languages and software stacks do not provide native support for it. That gap is the reason integration work in healthcare often costs more than it should — and the reason the standard remains a recurring challenge for engineers who encounter it for the first time.
The next page explores the flow of HL7v2 messages and how they are exchanged between systems.