SemanticFlow Is a Useful Reminder, Not the Product

Why a data agent should resolve semantics before tool execution and keep evidence attached to conclusions.

30 Apr, 2026

Many AI products still treat analysis as a chat problem.

Ask a question. Get an answer. Move on.

That works for a demo. It breaks down quickly in repeated analysis work, where the same questions have to be asked again, compared over time, and defended in front of other people.

SemanticFlow, an open-source tool, is a useful reminder of a more complete sequence:

resolve the semantic meaning first
call tools second
keep evidence attached to the conclusion

That sequence sounds obvious. In practice, many systems still skip at least one of those steps.

Start with semantics

If a system does not know what a metric, dimension, entity, or business concept means, it is not really analyzing. It is guessing from prompts and nearby context.

That is fine when the question is casual.

It is not fine when the same question has to be answered week after week with the same business meaning.

Semantic work exists to remove that ambiguity before anything else starts.

Once the meaning is fixed, the system no longer needs to infer whether GMV is net or gross, whether channels are defined by source or attribution, or whether an “active user” is based on login, visit, or transaction.

The point is not to make the system clever.

The point is to make the question stable.
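
As a minimal sketch of what "fixing the meaning first" could look like, the snippet below resolves every business term in a question against a published catalog and fails before any tool runs if a definition is missing. The names `MetricDefinition`, `CATALOG`, `resolve_terms`, and `UndefinedTermError` are hypothetical, and the two catalog entries are illustrative assumptions, not definitions taken from SemanticFlow or Tukun.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One published meaning for a business term (hypothetical schema)."""
    name: str
    description: str
    basis: str  # e.g. "gross" vs "net", or "login" vs "transaction"


class UndefinedTermError(LookupError):
    """Raised when a question uses a term that has no published definition."""


# Hypothetical catalog entries; the definitions are illustrative only.
CATALOG = {
    "gmv": MetricDefinition(
        "gmv", "Gross merchandise value before refunds", "gross"
    ),
    "active_user": MetricDefinition(
        "active_user", "User with at least one transaction in the period", "transaction"
    ),
}


def resolve_terms(terms, catalog=CATALOG):
    """Return a definition for every term, or fail before any tool is called."""
    missing = [t for t in terms if t.lower() not in catalog]
    if missing:
        raise UndefinedTermError(
            f"no published definition for: {', '.join(missing)}"
        )
    return {t.lower(): catalog[t.lower()] for t in terms}
```

With this shape, a question about "GMV" always resolves to the same gross definition, and a question about an unpublished term such as "churn" stops with an explicit error instead of a guess.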

Then let tools do the narrow work

Tool use matters, but only after the semantic layer is in place.

Without semantics, tool choice becomes trial and error. With semantics, tool choice becomes bounded.

At that point, different tools have different jobs:

a database checks the facts
a file or spreadsheet supplies local context
a warehouse handles larger-scale exploration
a semantic catalog resolves definitions

The model should not be picking tools just because they are available. It should pick them because the current semantic problem requires them.

That is where a data agent starts to become useful instead of merely impressive.
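
One way to make tool choice bounded, sketched under assumptions: classify what the question needs, then map each need to the tool whose job matches it, following the four roles listed above. The `Need` taxonomy, the tool names, and `plan_tools` are hypothetical and only illustrate the idea.

```python
from enum import Enum


class Need(Enum):
    """Kinds of semantic problem a question can pose (hypothetical taxonomy)."""
    DEFINITION = "definition"
    FACT_CHECK = "fact_check"
    LOCAL_CONTEXT = "local_context"
    EXPLORATION = "exploration"


# Each need maps to the one tool whose job matches it.
TOOL_FOR_NEED = {
    Need.DEFINITION: "semantic_catalog",
    Need.FACT_CHECK: "database",
    Need.LOCAL_CONTEXT: "file_or_spreadsheet",
    Need.EXPLORATION: "warehouse",
}


def plan_tools(needs, available):
    """Pick tools because a need requires them, not because they are available."""
    plan = []
    for need in needs:
        tool = TOOL_FOR_NEED[need]
        if tool not in available:
            raise RuntimeError(
                f"{need.value} requires {tool}, which is not available"
            )
        if tool not in plan:  # keep each tool once, in the order first needed
            plan.append(tool)
    return plan
```

A tool that is available but not required by any need never enters the plan, which is the bound the semantic layer provides.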

Keep evidence next to the conclusion

The biggest failure mode in AI analysis is not always an incorrect answer.

Often it is a confident answer that no longer shows where it came from.

If the system cannot tell you:

what data it used
what data it excluded
what transformations happened
what assumptions are still open

then the answer is hard to trust, hard to review, and hard to reuse.

That is why a good analysis path should preserve evidence as part of the output, not treat it as an internal detail.

When evidence is incomplete, the system should say so. When the semantic definition is missing, it should say so. When the result is a judgment rather than a fact, it should say so.

That is not weakness. It is the minimum standard for analysis that other people can rely on.
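
A rough sketch of evidence travelling with the conclusion, assuming a simple record type: the four questions above become fields, and the gaps are reported explicitly. `Evidence`, `Conclusion`, and `caveats` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """The four things a reviewer needs to see (hypothetical field names)."""
    data_used: list = field(default_factory=list)
    data_excluded: list = field(default_factory=list)
    transformations: list = field(default_factory=list)
    open_assumptions: list = field(default_factory=list)


@dataclass
class Conclusion:
    statement: str
    evidence: Evidence
    is_judgment: bool = False  # True when the result is an interpretation, not a fact

    def caveats(self):
        """List what the reader must be told explicitly."""
        notes = []
        if not self.evidence.data_used:
            notes.append("evidence incomplete: no source data recorded")
        if self.evidence.open_assumptions:
            notes.append(
                "open assumptions: " + "; ".join(self.evidence.open_assumptions)
            )
        if self.is_judgment:
            notes.append("this is a judgment, not a fact")
        return notes
```

Because the evidence is part of the returned object rather than an internal log, a reviewer can inspect it, and a later run can be compared against it.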

What this means for Tukun

For Tukun, this is the direction we care about:

semantics are published and reused as first-class assets
tools are called after the meaning is clear
conclusions stay linked to their evidence path
limitations are explicit instead of hidden

We do not think a data agent should try to replace judgment.

We think it should remove the messy parts that make judgment unreliable:

ambiguous definitions
scattered evidence
repeated manual exploration
conclusions that cannot be traced back

That is a more ordinary idea than what a flashy demo promises.

It is also the part that survives real work.