Building a Road Network from OSM: Roads, Segments, and Directed Edges

A practical look at how a routing graph emerges from OSM nodes and ways


The first shift happens earlier than most people think. Before restrictions, before costs, before routing algorithms, there is already a modeling step that changes the character of the data. In OSM, a road arrives as a way: an ordered list of node references with tags attached to it. That is a perfectly good source representation. It describes the road geometrically, and it can carry useful semantics such as road class, name, or one-way information. But a routing graph still cannot use that way as-is. The graph does not traverse a road in the loose map sense. It traverses directed edges.

That difference matters because a way in OSM is still too broad. It can run through many intermediate points, bend several times, pass multiple junctions, and continue for quite a long distance as one object. The routing graph needs something smaller and sharper. It needs traversable units between consecutive graph nodes, and it needs them in a direction that can actually be followed. That is why building the graph means segmenting a way into edge-sized pieces and making direction explicit. In graph terms, that is standard road-network modeling: nodes represent junctions or relevant points, and edges represent the road segments between them. Once movement rules matter, those edges are directed, because the legality of travel can differ by direction.

You can already see the raw material for that in a normal OSM way:

<way id="26659127" user="Masch" uid="55988" visible="true" version="5" changeset="4142606" timestamp="2010-03-16T11:47:08Z">
  <nd ref="292403538"/>
  <nd ref="298884289"/>
  <nd ref="261728686"/>
  <tag k="highway" v="unclassified"/>
  <tag k="name" v="Pastower Straße"/>
</way>

That way is meaningful as source data, but the graph still has to do the harder part. It has to turn the ordered node chain into traversable pieces. If the way runs from node A to node B to node C, the graph does not stop at "this is one street". It derives one segment from A to B and another from B to C. And then it asks the more serious question: in which direction is each of those segments actually traversable? That is the point where the road network stops being a visual description and starts becoming a movement model.
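As a rough illustration, the way above can be read into an ordered node list and a tag dictionary with Python's standard library. This is only a sketch: the helper name `parse_way` is hypothetical, and a real import would stream a full OSM extract rather than parse a single element from a string.

```python
import xml.etree.ElementTree as ET

# The sample way from above, embedded as a string for the sketch.
WAY_XML = """
<way id="26659127" user="Masch" uid="55988" visible="true" version="5" changeset="4142606" timestamp="2010-03-16T11:47:08Z">
  <nd ref="292403538"/>
  <nd ref="298884289"/>
  <nd ref="261728686"/>
  <tag k="highway" v="unclassified"/>
  <tag k="name" v="Pastower Straße"/>
</way>
"""


def parse_way(xml_text):
    """Return (way_id, node_refs, tags) for a single <way> element.

    Hypothetical helper: node_refs keeps the source order, tags is a dict.
    """
    way = ET.fromstring(xml_text.strip())
    node_refs = [int(nd.get("ref")) for nd in way.findall("nd")]
    tags = {tag.get("k"): tag.get("v") for tag in way.findall("tag")}
    return int(way.get("id")), node_refs, tags


way_id, node_refs, tags = parse_way(WAY_XML)
print(way_id, node_refs, tags)
```

The order of `node_refs` is the part that matters later: segments and directions are both derived from it.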

Why a Way Is Not Yet an Edge

One of the easiest mistakes in this space is to look at an OSM way and assume it already corresponds to one edge in the routing graph. It does not. A way is a source object: an ordered chain of node references with tags. That is enough to describe a road in the map data, but not enough to define the traversable units the graph needs.

The important difference is simple. A way can contain several consecutive node pairs, and those node pairs are exactly where the graph begins to resolve the source object into smaller movement units. In our model, the graph does not traverse the whole way as one indivisible thing. It derives edges from the consecutive node pairs inside that way, and only after that does direction become explicit.

In a primal street network, intersections are modeled as nodes and street segments as edges. The source geometry is not yet the graph. The graph begins where the road is expressed in explicit edge-level units.

From Ways to Edges

An OSM way is already a street segment in source form. It is an ordered chain of nodes with tags. The important point is that it is still one source object even when it contains several intermediate nodes. The next step in the road network is therefore not to jump straight to directed movement. The next step is to resolve that way into edges between consecutive node pairs.

A small diagram shows that step directly:

way: A - B - C - D

edges:
(A,B)
(B,C)
(C,D)

That is the first structural transformation. The way remains the source street segment. The graph turns it into smaller edge-level units. Only after that does direction become explicit and the graph derive the traversable movements from those edges.

A second diagram makes the distinction clearer:

source way
A ---- B ---- C ---- D

graph edges
E1 = (A,B)
E2 = (B,C)
E3 = (C,D)

This matters because the graph should not attach movement semantics to the whole way when those semantics apply at a finer level. Access, one-way rules, restrictions, and costs need graph objects that are local enough to carry them precisely. The way is too broad for that. The edges are the first level where the road becomes decomposed into explicit local connections the graph can actually work with.
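The segmentation step shown in the diagrams can be sketched in a few lines of Python. The function name `segment_way` is an assumption made for illustration; the only idea it carries is "pair each node with its successor".

```python
def segment_way(node_refs):
    """Return the consecutive node pairs of a way, in source order.

    Hypothetical helper: a way with n nodes yields n - 1 edges.
    """
    return list(zip(node_refs, node_refs[1:]))


edges = segment_way(["A", "B", "C", "D"])
print(edges)
```

Each resulting pair is still undirected in meaning at this point. It records local connectivity only.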

Street-network literature makes the broader point very clearly: in a primal graph representation, intersections are modeled as nodes and street segments as edges, and a usable topology requires separating true network nodes from intermediate geometry points that only shape the line of the street. That distinction is exactly what matters here. The source geometry is not yet the graph. The graph begins where the road is resolved into explicit edges between graph-relevant points.
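One common way to separate network nodes from shape-only points is to count how often each node is referenced. The sketch below assumes a simplified rule, which is not necessarily the rule used in the model described here: a node is graph-relevant if it ends a way or is referenced more than once across all ways. The function name `network_nodes` is hypothetical.

```python
from collections import Counter


def network_nodes(ways):
    """Return nodes that end a way or are referenced more than once.

    Simplified, assumed rule: every other node only shapes the line
    of the street.
    """
    uses = Counter(ref for refs in ways for ref in refs)
    keep = set()
    for refs in ways:
        keep.update((refs[0], refs[-1]))               # way endpoints
        keep.update(r for r in refs if uses[r] > 1)    # shared nodes
    return keep


# Two hypothetical ways that meet at C; B only shapes the first way.
ways = [["A", "B", "C", "D"], ["C", "E"]]
nodes = network_nodes(ways)
print(sorted(nodes))
```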

That is the chain the rest of the model depends on. The way is the source street segment. The edges are the local graph units derived from it. The directed edges come one step later and represent the actual traversable movements the network exposes. Once that distinction is clear, the rest of the model falls into place much more cleanly.

From Edges to Directed Edges

Edges are still not the final thing the graph routes over. They tell us that two graph points are connected. That is already an important step, but it still does not say whether movement is possible from left to right, from right to left, or in both directions. For routing, that distinction is not secondary. It is the structure of movement itself.

A small diagram is enough to show the next step:

edge:
(A,B)

directed edges:
A -> B
B -> A

If the source semantics say that movement is only allowed in one direction, the result changes immediately:

edge:
(A,B)

directed edge:
A -> B

That is the whole point. The edge expresses local connectivity. The directed edge expresses actual traversable movement.
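A minimal sketch of that step, assuming direction comes from the OSM `oneway` tag and handling only the values `yes`, `-1` (one-way against the node order), and everything else as bidirectional. Implied one-way rules, such as those for roundabouts, are deliberately left out, and the function name is hypothetical.

```python
def directed_edges(edge, oneway="no"):
    """Expand one edge into the directed edges that may be traversed.

    Simplified assumption: only the oneway values "yes" and "-1"
    restrict direction; any other value means both directions.
    """
    a, b = edge
    if oneway == "yes":
        return [(a, b)]          # only along the node order
    if oneway == "-1":
        return [(b, a)]          # only against the node order
    return [(a, b), (b, a)]      # both movements exist


print(directed_edges(("A", "B")))
print(directed_edges(("A", "B"), oneway="yes"))
```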

This matters because the graph is not supposed to answer whether two points are somehow adjacent in the source geometry. It has to answer whether a vehicle may move from one point to the next in a specific direction. A one-way street is the obvious case, but it is not the only one. The same local connection can later carry different movement semantics depending on direction. That is why direction cannot remain an interpretation outside the graph. It has to become part of the graph itself.

A slightly larger example shows how this grows out of the segmented source structure:

way: A - B - C
way: C - D

edges:
(A,B)
(B,C)
(C,D)

directed edges:
A -> B
B -> A
B -> C
C -> B
C -> D

The source road is not one undifferentiated object. OSM already separates it into ways, and those boundaries usually matter because semantics change there. The graph keeps that as part of the model. It derives edges from consecutive node pairs inside those ways, and then derives directed edges from those edges according to the movement semantics of that source segment. In this example the second way, from C to D, is one-way, so only C -> D exists. The reverse movement does not.
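The same example can be expressed as a small adjacency structure. This is a sketch under stated assumptions: the `WAYS` list, its tag values, and the name `build_adjacency` are all hypothetical, and only the `oneway` values `yes` and `-1` are interpreted.

```python
from collections import defaultdict

# Hypothetical input mirroring the example: A-B-C is bidirectional,
# C-D is tagged one-way.
WAYS = [
    {"nodes": ["A", "B", "C"], "tags": {"highway": "residential"}},
    {"nodes": ["C", "D"], "tags": {"highway": "residential", "oneway": "yes"}},
]


def build_adjacency(ways):
    """Map each node to the nodes reachable over one directed edge."""
    adjacency = defaultdict(list)
    for way in ways:
        oneway = way["tags"].get("oneway", "no")
        forward = oneway != "-1"     # movement along the node order
        backward = oneway != "yes"   # movement against the node order
        for a, b in zip(way["nodes"], way["nodes"][1:]):
            if forward:
                adjacency[a].append(b)
            if backward:
                adjacency[b].append(a)
    return dict(adjacency)


adjacency = build_adjacency(WAYS)
print(adjacency)
```

D has no entry in the result because no directed edge starts there. That is the one-way rule showing up as structure rather than as a later check.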

Once the graph is in that form, a lot of later design decisions stop looking arbitrary. Transition constraints are defined over directed edges. Path constraints are ordered sequences of directed edges. A route itself is nothing other than a valid chain of directed movements through the graph. That is why I do not treat directed edges as a small technical refinement of edges. They are the level at which the road network becomes precise enough to talk about legal movement at all.
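To make "a valid chain of directed movement" concrete, here is a small sketch that checks a node sequence against the directed edges from the example above. The set `DIRECTED_EDGES` and the function `is_valid_route` are illustrative names, not part of any particular routing library.

```python
# Directed edges from the example: C -> D has no reverse movement.
DIRECTED_EDGES = {
    ("A", "B"), ("B", "A"),
    ("B", "C"), ("C", "B"),
    ("C", "D"),
}


def is_valid_route(nodes, directed_edges):
    """True if every consecutive step of the route is a directed edge."""
    return all(step in directed_edges for step in zip(nodes, nodes[1:]))


print(is_valid_route(["A", "B", "C", "D"], DIRECTED_EDGES))
print(is_valid_route(["D", "C", "B", "A"], DIRECTED_EDGES))
```

The second route fails on its first step, because D -> C was never created.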

Why a Bidirectional Road Still Becomes Two Directed Edges

Once edges are in place, the next question comes naturally. If a road can be used in both directions, why not keep one edge and stop there? The answer is simple, but it matters. A bidirectional road still contains two different movements. One movement goes from A to B. The other goes from B to A. They share the same physical road and often the same geometry, but in the graph they are still different traversals.

Here is a simple example:

edge:
(A,B)

directed edges:
A -> B
B -> A

This is not duplication for the sake of form. It is the graph stating movements explicitly. The road between A and B is usable in both directions, so the graph contains both traversals. If the source semantics say it is one-way instead, then only one directed edge exists. That difference belongs in the graph itself, not in some later interpretation layer.

This is also the point where the model becomes much easier to work with. An undirected edge can only say that two graph points are connected. A directed edge says something stronger: movement from this node to that node is possible. That is exactly the level later rules need. Access does not apply to an abstract connection in the middle. It applies to a directed movement. A turn restriction is not a property of a road in general. It is a restriction on moving from one directed edge into another. Even costs become cleaner at this level, because they belong to a movement through the graph, not to an undifferentiated line on the map.
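A short sketch of how costs and a turn restriction might attach at this level. All values and names here are hypothetical: `COST` holds an invented cost per directed edge, and `FORBIDDEN_TURNS` holds pairs of directed edges that may not follow each other (in this case a U-turn at B).

```python
# Hypothetical per-direction costs: A -> B and B -> A differ on purpose.
COST = {
    ("A", "B"): 12.0, ("B", "A"): 15.0,
    ("B", "C"): 8.0, ("C", "B"): 8.0,
}

# A restriction is a pair (incoming directed edge, outgoing directed edge).
FORBIDDEN_TURNS = {(("A", "B"), ("B", "A"))}


def route_cost(nodes, cost, forbidden_turns):
    """Return the summed cost of a route, or None if it is not allowed."""
    edges = list(zip(nodes, nodes[1:]))
    if any(edge not in cost for edge in edges):
        return None                      # movement does not exist
    for transition in zip(edges, edges[1:]):
        if transition in forbidden_turns:
            return None                  # restricted edge-to-edge move
    return sum(cost[edge] for edge in edges)


print(route_cost(["A", "B", "C"], COST, FORBIDDEN_TURNS))
print(route_cost(["A", "B", "A"], COST, FORBIDDEN_TURNS))
```

Neither the cost table nor the restriction set needs to know about ways. Both are keyed entirely by directed edges.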

That is why a bidirectional road still becomes two directed edges. The graph is not trying to mirror the visual simplicity of the map. It is trying to represent movement precisely enough that later semantics can attach without ambiguity. Once both directions are explicit, the rest of the network has a stable structure to work on.

What Directed Edges Make Possible

The real value of directed edges appears one step later, when the graph has to carry actual road semantics instead of just connectivity. Without them, the network can still show that two points are linked. With them, it can express the exact movements the road system permits. That is the difference the rest of the model depends on.

This is where the graph becomes usable for more than route calculation alone. A cost can belong to one concrete traversal. An access rule can deny one direction without affecting the other. A restriction can attach to the movement it actually governs instead of to a broader road object nearby. That makes the network much easier to inspect, explain, and expose through an API, because the structure already contains the same units a routing decision is built from.

That is also why the step is worth making explicitly. Directed edges do not add conceptual overhead to the graph. They remove ambiguity from everything built on top of it.

Cheers!

