Testing & Quality

1

Why Testing Is Mandatory

Majormatic executes your logic exactly as you defined it. The platform does not evaluate whether the logic is professionally correct; it only enforces that the output conforms to your declared schema. If your logic is wrong, the system produces well-structured wrong results.

Testing is the only mechanism that validates whether your workflow produces outputs that are professionally accurate, not just structurally valid. You must test before accepting the Digital Oath — the oath affirms that you have done this work.

The platform validates structure. You validate meaning.

result.json ensures the output has the right shape. Your test library ensures the output has the right content. These are separate validation layers and both are mandatory.
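The two layers can be sketched as two separate checks. This is only an illustration: the section names, the `REQUIRED_SECTIONS` mapping and both functions are hypothetical, since the actual result.json format is not shown in this guide.

```python
# Hypothetical declared sections; the real result.json format may differ.
REQUIRED_SECTIONS = {"summary": str, "risk_level": str}


def is_structurally_valid(output: dict) -> bool:
    """Layer 1: every declared section is present and has the declared type."""
    return all(
        isinstance(output.get(name), expected)
        for name, expected in REQUIRED_SECTIONS.items()
    )


def is_professionally_correct(output: dict, expected_risk: str) -> bool:
    """Layer 2: the content matches what the expert says the answer should be."""
    return output["risk_level"] == expected_risk


# A well-structured wrong result: the shape is right, the conclusion is not.
output = {"summary": "Contract contains an uncapped indemnity.", "risk_level": "low"}
structure_ok = is_structurally_valid(output)
content_ok = is_professionally_correct(output, expected_risk="high")
```

Here `structure_ok` is true and `content_ok` is false, which is exactly the case that only your own tests can catch.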

2

The Expert Test Library

The Expert Test Library is the set of reusable test scenarios you create for your app. It must be built before publication. Its purpose is to prove your app works reliably across the realistic range of inputs it will receive in professional use.

📋

Minimum 3 test cases

Platform review requires at least 3 test cases. Professional quality requires more. Build as many as needed to cover your realistic use range.

🔁

Reusable scenarios

Test cases should be structured so you can re-run them when you update the app. This allows you to verify that changes do not break existing functionality.

📊

Representative coverage

Tests must cover the realistic range of inputs — not just the ideal case. Include typical professional scenarios your users will actually encounter.
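One way to keep scenarios reusable is to store each one as data and re-run the whole set after every change. The sketch below assumes a hypothetical `Scenario` record and a stand-in `fake_app` function; neither is part of the platform, and the real app would be called in place of the stand-in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    name: str
    test_type: str                  # "valid", "invalid", "missing_data" or "extreme"
    inputs: dict
    check: Callable[[dict], bool]   # expert-written judgement of the output


def run_library(app: Callable[[dict], dict], library: list[Scenario]) -> dict[str, bool]:
    """Run every stored scenario against the app and report pass/fail by name."""
    return {case.name: case.check(app(case.inputs)) for case in library}


def fake_app(inputs: dict) -> dict:
    """Stand-in for the real app so the sketch runs on its own."""
    return {"word_count": len(inputs.get("text", "").split())}


LIBRARY = [
    Scenario("typical memo", "valid", {"text": "one two three"},
             lambda out: out["word_count"] == 3),
    Scenario("single word", "extreme", {"text": "one"},
             lambda out: out["word_count"] == 1),
    Scenario("empty text", "missing_data", {"text": ""},
             lambda out: out["word_count"] == 0),
]

results = run_library(fake_app, LIBRARY)
```

Because the scenarios are plain data, the same `LIBRARY` can be re-run unchanged after each update to confirm that existing behaviour still holds.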

3

Test Types

Your test library must include all four test types. Omitting any type leaves gaps that platform review will identify — and that your users will eventually encounter.

Valid Input

Standard Professional Input

Test with complete, correctly formatted inputs that represent the typical professional use case. Verify the output is structured, accurate, and meets the quality criteria declared in result.json.

Example: A legal memo app tested with a complete case summary, correct jurisdiction, and clearly stated legal question. Verify the output sections are complete and professionally correct.

Invalid Input

Malformed or Incomplete Input

Test with inputs that violate validation rules — too short, wrong format, missing required fields. Verify the platform catches these before execution. These tests confirm your input.json validation rules work correctly.

Example: Submit with an empty required field. Confirm the platform blocks execution with a clear validation error, not a failed run.
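An invalid-input test can be pictured as a check that runs before execution and returns errors instead of starting a run. The rule format, field names and `validate` function below are assumptions made for illustration; the real input.json rule syntax is not shown in this guide.

```python
# Hypothetical validation rules; the real input.json format may differ.
RULES = {
    "case_summary": {"required": True, "min_length": 50},
    "jurisdiction": {"required": True, "min_length": 2},
}


def validate(inputs: dict) -> list[str]:
    """Return validation errors; an empty list means the run may proceed."""
    errors = []
    for field, rule in RULES.items():
        value = (inputs.get(field) or "").strip()
        if rule["required"] and not value:
            errors.append(f"{field}: required field is empty")
        elif len(value) < rule["min_length"]:
            errors.append(f"{field}: shorter than {rule['min_length']} characters")
    return errors


empty_field = {"case_summary": "", "jurisdiction": "England and Wales"}
too_short = {"case_summary": "Tenant dispute.", "jurisdiction": "England and Wales"}
```

Each invalid scenario should produce a specific, readable error, so the test asserts on the message as well as on the fact that execution was blocked.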

Missing Data

Partial or Ambiguous Input

Test with inputs that are technically valid but professionally incomplete: cases where the AI may struggle because insufficient context was provided. Verify the output flags the uncertainty explicitly rather than presenting guesses as findings.

Example: A document analysis app given a document with redacted sections. Verify the output acknowledges the gaps rather than fabricating information.
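A missing-data test needs a concrete pass condition. A minimal sketch, assuming the test document marks gaps with a `[REDACTED]` marker and that the output declares a `limitations` list (both are hypothetical names chosen for this example):

```python
REDACTION_MARKER = "[REDACTED]"  # assumed marker used in the test documents


def acknowledges_gaps(document: str, output: dict) -> bool:
    """If the input contains redactions, the output must list at least one limitation."""
    if REDACTION_MARKER not in document:
        return True
    return len(output.get("limitations", [])) > 0


document = "The tenant paid [REDACTED] on 3 March."
good_output = {"limitations": ["The payment amount is redacted and could not be assessed."]}
bad_output = {"limitations": []}
```

This only confirms that a gap was acknowledged. Whether the acknowledgement is adequate, and whether anything was fabricated, still needs expert review of the output text.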

Extreme Cases

Edge Cases and Boundary Inputs

Test with inputs at the limits of your declared constraints: maximum file size, minimum and maximum text length, and unusual characters. Verify the app behaves correctly at each boundary.

Example: A summarisation app tested with a document at exactly the maximum file size. Verify execution completes within time limits and the output structure is intact.
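Boundary inputs are easier to keep complete when they are generated from the declared limits rather than written by hand. The limits below are placeholder values, not platform defaults, and `boundary_inputs` is a hypothetical helper.

```python
MIN_CHARS = 50        # assumed minimum text length
MAX_CHARS = 20_000    # assumed maximum text length


def boundary_inputs(min_chars: int, max_chars: int) -> dict[str, str]:
    """Build inputs at, just below and just above each declared limit."""
    return {
        "below_min": "a" * (min_chars - 1),
        "at_min": "a" * min_chars,
        "at_max": "a" * max_chars,
        "above_max": "a" * (max_chars + 1),
        # Non-ASCII text, truncated so it never exceeds the maximum length.
        "unusual_chars": ("漢字 🙂 " * min_chars)[:max_chars],
    }


def within_limits(text: str) -> bool:
    """True when the text length falls inside the declared range."""
    return MIN_CHARS <= len(text) <= MAX_CHARS


cases = boundary_inputs(MIN_CHARS, MAX_CHARS)
```

The `below_min` and `above_max` cases should be rejected by validation; the other three should execute and return an intact output structure.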

4

Output Quality Standards

Every output your app produces must meet these quality standards. These are the criteria against which platform reviewers will evaluate your app — and the standards your professional users will rely on.

Accurate

The output reflects what the input data actually states. No fabrication, no unfounded assertions, no overclaiming.

Clear

The output is written in plain professional language appropriate to the target audience. Technical terms are used correctly.

Consistent

Running the same input twice produces structurally consistent results: the same sections and fields, in the same format. Minor variation in AI-generated wording is acceptable; structural inconsistency is not.

Explainable

The user must be able to understand how the output was produced — what inputs drove which conclusions. Black-box reasoning is not acceptable.
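The Consistent standard above can be checked mechanically by comparing the structure of two runs while ignoring their wording. The `shape` helper and the sample outputs are illustrative assumptions, not a platform feature.

```python
def shape(value):
    """Reduce a JSON-like value to its structure: keys and type names, no content."""
    if isinstance(value, dict):
        return {key: shape(item) for key, item in value.items()}
    if isinstance(value, list):
        # Lists may differ in length between runs; compare the first item's shape only.
        return [shape(value[0])] if value else []
    return type(value).__name__


run_1 = {
    "summary": "The lease ends in May.",
    "findings": [{"clause": "4.2", "risk": "high"}],
}
run_2 = {
    "summary": "Lease terminates in May.",
    "findings": [{"clause": "4.2", "risk": "high"}, {"clause": "7", "risk": "low"}],
}
run_3 = {"summary": "The lease ends in May.", "findings": "none"}

same_structure = shape(run_1) == shape(run_2)
drifted = shape(run_1) != shape(run_3)
```

Runs 1 and 2 differ in wording and in the number of findings but share a structure. Run 3 returns a string where a list is expected, which is the kind of inconsistency this standard rules out.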

What to Avoid

✗ Ambiguity

Outputs that could be interpreted multiple ways leave users unable to make confident professional decisions. Be explicit.

✗ Over-generalisation

Outputs that apply generic statements to specific inputs undermine professional utility. The output must address the specific case provided.

✗ Unverifiable claims

Outputs that assert facts without a traceable basis cannot be defended professionally. Claims must be supportable from the input data or cited external sources.
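Traceability can be partly tested if each claim in the output carries the passage it relies on. The sketch assumes a hypothetical output format in which every claim has a `statement` and a `source_quote`; it flags claims whose quote does not appear verbatim in the input.

```python
def untraceable_claims(input_text: str, claims: list[dict]) -> list[str]:
    """Return statements whose supporting quote is missing or not found in the input."""
    flagged = []
    for claim in claims:
        quote = claim.get("source_quote", "")
        if not quote or quote not in input_text:
            flagged.append(claim["statement"])
    return flagged


input_text = "The lease began on 1 May 2021. Rent is 1,200 per month."
claims = [
    {"statement": "The lease began in May 2021.",
     "source_quote": "The lease began on 1 May 2021."},
    {"statement": "The tenant is in arrears.",
     "source_quote": "Rent is overdue."},
    {"statement": "The landlord acted unlawfully."},
]

flagged = untraceable_claims(input_text, claims)
```

A verbatim match only shows that the quoted passage exists in the input. Whether it actually supports the statement, and whether cited external sources are sound, remains an expert judgement.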

5

Pre-Oath Checklist

Before accepting the Digital Oath and submitting for review, verify each item in this checklist. This is the standard the platform review team applies.

✓ All four test types have been run and reviewed

✓ At least 3 test cases are documented and reusable

✓ Outputs are accurate, clear, consistent, and explainable

✓ supervision_required is set correctly in governance.json

✓ Output schema (result.json) declares all required sections and fields

✓ Professional caveats are included where outputs require expert review

✓ App description is accurate and does not overstate capability

✓ Edge-case inputs have been tested and the app handles them safely
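The first two checklist items can be verified automatically before submission. In this sketch the test type labels and the `library_gaps` helper are hypothetical, and each test case is assumed to carry exactly one type label.

```python
REQUIRED_TYPES = {"valid", "invalid", "missing_data", "extreme"}
MIN_CASES = 3


def library_gaps(test_types: list[str]) -> list[str]:
    """List what the test library still lacks; an empty list means both checks pass."""
    gaps = []
    if len(test_types) < MIN_CASES:
        gaps.append(f"only {len(test_types)} test cases; at least {MIN_CASES} required")
    missing = sorted(REQUIRED_TYPES - set(test_types))
    if missing:
        gaps.append("missing test types: " + ", ".join(missing))
    return gaps


three_cases = library_gaps(["valid", "invalid", "extreme"])
four_cases = library_gaps(["valid", "invalid", "missing_data", "extreme"])
```

Under the one-label-per-case assumption, a library of exactly three cases cannot cover all four test types, so in practice four is the working minimum.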