Realistic Fake Data in Java

With DataFaker + EasyRandom: practical patterns for APIs, tests, and demos
Wallace Espindola, Solution Architect

Agenda
We'll cover:
• Why fake data matters
• DataFaker vs EasyRandom (quick compare)
• Core patterns & recipes
• API & testing workflows
• Production-ish practices (observability, determinism)
• Short demo plan
• Wrap-up & resources

Why fake data matters
What you gain:
• Realistic demos and prototypes (no more John Doe).
• Stronger test coverage with varied inputs.
• Repeatable load & chaos testing with large datasets.
• Safer staging data without touching production records.

DataFaker vs EasyRandom

Aspect   | DataFaker                                | EasyRandom
Goal     | Realistic, localized field values        | Auto-populate full object graphs
Best at  | Names, emails, addresses, business, etc. | POJOs, nested objects, collections
Usage    | faker.name().fullName()                  | random.nextObject(User.class)
Pros     | Believable data; locales                 | Zero boilerplate for complex models
Together | Polish fields with realism               | Generate structure; then fine-tune

DataFaker: quick start

    import net.datafaker.Faker;

    Faker faker = new Faker();
    String fullName = faker.name().fullName();
    String email = faker.internet().emailAddress();
    String address = faker.address().fullAddress();

    // Locale example:
    Faker pt = new Faker(new java.util.Locale("pt"));
    String nome = pt.name().fullName();

EasyRandom: object population

    import org.jeasy.random.EasyRandom;
    import org.jeasy.random.EasyRandomParameters;

    EasyRandomParameters params = new EasyRandomParameters()
        .seed(System.currentTimeMillis())
        .stringLengthRange(5, 20);
    EasyRandom random = new EasyRandom(params);
    User user = random.nextObject(User.class);

Combine both: structure + realism

    // Let EasyRandom build the object, then overwrite fields that should look real.
    User u = random.nextObject(User.class);
    if (u.getId() == null || u.getId().isBlank()) {
        u.setId(java.util.UUID.randomUUID().toString());
    }
    u.setFullName(faker.name().fullName());
    u.setEmail(faker.internet().emailAddress());

API pattern (framework-agnostic)
Principles:
• All responses include a server-side timestamp.
• Prefer path variables for simple params (e.g., /users/{count}).
• Offer GET for idempotent reads; for demos, mirror POST-only operations with a safe GET equivalent.
• Document via OpenAPI / Swagger UI for easy discovery.

    {
      "data": [
        {
          "id": "...",
          "fullName": "...",
          "email": "...",
          "phone": "...",
          "address": "..."
        }
      ],
      "timestamp": "2025-10-07T12:34:56Z"
    }
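
The response shape above could be modeled as a small generic record. This is a minimal sketch, not the deck's actual implementation: the names ApiResponse and of are hypothetical, and the Clock parameter is an assumption added so tests can pin the timestamp.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.List;

// Hypothetical envelope: a list payload plus a server-side timestamp.
record ApiResponse<T>(List<T> data, Instant timestamp) {

    // Copies the payload into an immutable list and stamps it with the clock's current time.
    static <T> ApiResponse<T> of(List<T> data, Clock clock) {
        return new ApiResponse<>(List.copyOf(data), Instant.now(clock));
    }
}
```

In production code the clock would typically be Clock.systemUTC(); a fixed clock keeps test assertions stable.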

Observability: timestamps everywhere
Why timestamps help:
• Correlate client logs with server events.
• Measure end-to-end latency in demos.
• Easy debugging across distributed services.
• Great for screenshot-friendly outputs.

Determinism: seeds & reproducibility
Tips:
• Use a fixed seed to reproduce exact datasets in tests.
• Use dynamic seeds for demos/live streams.
• Keep a toggle to switch between deterministic and random modes.

    // Deterministic EasyRandom
    EasyRandom r = new EasyRandom(new EasyRandomParameters().seed(42));

    // Deterministic DataFaker
    Faker deterministic = new Faker(new java.util.Random(42));
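
The toggle between deterministic and random modes could be as small as one helper. A minimal sketch using only java.util.Random; the class name Seeds and the convention "null seed means random mode" are assumptions for illustration.

```java
import java.util.Random;

// Hypothetical helper: one place to decide between reproducible and fresh randomness.
final class Seeds {
    private Seeds() {}

    // A non-null seed gives a repeatable sequence; null gives a differently seeded Random per call.
    static Random randomFor(Long seed) {
        return seed != null ? new Random(seed) : new Random();
    }
}
```

The returned Random can be handed to new Faker(random) as in the snippet above, and the same seed value can be passed to EasyRandomParameters.seed(...).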

Localization & realism
Make data feel real:
• Pick locale per environment or per request (e.g., ?locale=pt).
• Mix locales for international datasets.
• Ensure email/phone formats match the locale when showcased.
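
Per-request locale selection needs a safe fallback for missing or unexpected values. A minimal sketch with the JDK's Locale API; the class name LocaleResolver, the allow-list contents, and the English fallback are all assumptions to adapt.

```java
import java.util.Locale;
import java.util.Set;

final class LocaleResolver {
    // Assumed allow-list of language codes for this sketch; adjust to what you actually showcase.
    private static final Set<String> SUPPORTED = Set.of("en", "pt", "fr", "de");
    private static final Locale FALLBACK = Locale.ENGLISH;

    private LocaleResolver() {}

    // Maps a raw ?locale= value to a Locale, falling back to English when absent or unsupported.
    static Locale resolve(String param) {
        if (param == null || param.isBlank()) {
            return FALLBACK;
        }
        // Accept both "pt-BR" and "pt_BR" spellings.
        Locale candidate = Locale.forLanguageTag(param.trim().replace('_', '-'));
        return SUPPORTED.contains(candidate.getLanguage()) ? candidate : FALLBACK;
    }
}
```

The resolved Locale can then be passed to new Faker(locale), as in the quick-start snippet.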

Generating large datasets CSV/JSON int n = 10_000; try (java.io.PrintWriter
out = new java.io.PrintWriter("users.csv")) { out.println("id,fullName,email,phone,address"); for (int i=0;i<n;i++){ User u = fakerUser(); // or EasyRandom + polish out.printf("%s,%s,%s,%s,%s\n", u.getId(), u.getFullName(), u.getEmail(), u.getPhone(), u.getAddress()); } }

Testing workflows
Patterns:
• Use EasyRandom for object graphs in unit tests.
• Override a few fields with DataFaker.
• Create reusable factories (e.g., UserFactory) for clarity.
• Keep seeds fixed in CI to avoid flaky tests.
• Bundle a Postman collection for API checks.
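
A reusable factory might look like the following. This is only a sketch: SampleUser and the UserFactory method names are hypothetical (the deck's own User class is not shown), and the generators are injected as plain Supplier values so the example compiles without either library.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical DTO for this sketch.
record SampleUser(String id, String fullName, String email) {}

// Hypothetical factory: tests ask for "a user" and override only the fields they care about.
final class UserFactory {
    private final Supplier<String> names;
    private final Supplier<String> emails;

    UserFactory(Supplier<String> names, Supplier<String> emails) {
        this.names = names;
        this.emails = emails;
    }

    // Ids are random UUIDs here; inject an id supplier as well if tests assert on ids.
    SampleUser aUser() {
        return new SampleUser(UUID.randomUUID().toString(), names.get(), emails.get());
    }

    // Same as aUser(), but with a caller-chosen email for tests that depend on it.
    SampleUser aUserWithEmail(String email) {
        return new SampleUser(UUID.randomUUID().toString(), names.get(), email);
    }
}
```

With DataFaker on the classpath, the suppliers could be () -> faker.name().fullName() and () -> faker.internet().emailAddress(), using a seeded Faker in CI.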

Health & sanity checks
Ideas:
• Expose a health endpoint that includes a timestamp detail.
• Add a lightweight /ping returning { "ok": true, "timestamp": ... }.
• Log a tiny sample of generated data at startup for quick visibility.
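
The /ping body can be produced without any framework. A minimal sketch; the class name Ping and the injected Clock are assumptions, and a real service would more likely serialize a DTO through its JSON library.

```java
import java.time.Clock;
import java.time.Instant;

final class Ping {
    private Ping() {}

    // Builds the body by hand; Instant.toString() yields ISO-8601 UTC text that needs no JSON escaping.
    static String body(Clock clock) {
        return "{\"ok\":true,\"timestamp\":\"" + Instant.now(clock) + "\"}";
    }
}
```

Any controller or resource class can return Ping.body(Clock.systemUTC()) as its response.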

Keep it framework-agnostic
Guidance:
• The generation logic (DataFaker + EasyRandom) lives in plain Java services.
• Controllers/resources stay thin; any web framework can host them.
• Works great with Spring or Quarkus: the core code stays the same.
• Focus on portability: DTOs (records) + minimal dependencies.

Demo flow
Run-through:
1. Hit GET /users/{count} → see realistic data + timestamp.
2. Toggle ?easy=true → object population differs slightly.
3. Switch locales → names/addresses feel regional.
4. Export 10k users → CSV, open in spreadsheet.
5. Brief on seeds → re-run deterministic dataset.

Security & ethics
Be careful:
• Never mix real data with fake data in the same dataset.
• Label fake data clearly in demos and logs.
• If masking real data, use one-way (irreversible) transforms.
• Document locale assumptions and format limitations.
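
One common way to get a one-way transform is a keyed hash. The sketch below uses HMAC-SHA256 from the JDK; it is one possible technique, not the deck's stated method, and the class name Masker and the key handling are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Masker {
    private Masker() {}

    // Keyed one-way transform: the same value and key always give the same 64-char hex token.
    // The token cannot be inverted directly; keep the key secret, since low-entropy inputs are guessable.
    static String mask(String value, byte[] secretKey) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey, "HmacSHA256"));
            byte[] digest = mac.doFinal(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

Because the mapping is stable for a given key, masked values still join consistently across tables.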

Common gotchas
Watch for:
• Email format realism vs. deliverability (don't spam real domains).
• Phone number formatting per region.
• Long strings & edge cases (min/max length).
• Nulls in nested objects when EasyRandom rules are too strict: tune the parameters.
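
One way to avoid deliverable addresses is to force every generated email onto a reserved domain. A minimal sketch: the class and method names are hypothetical; example.com is reserved by RFC 2606 for documentation and testing.

```java
final class SafeEmail {
    private SafeEmail() {}

    // Keeps the local part and swaps the domain for example.com, which is reserved for testing.
    static String toReserved(String email) {
        int at = email.lastIndexOf('@');
        String local = at >= 0 ? email.substring(0, at) : email;
        return local + "@example.com";
    }
}
```

Applied after generation, this keeps realistic-looking local parts while guaranteeing that nothing is sent to a real mailbox.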

Cheat sheet (copy & paste)

    // DataFaker
    Faker faker = new Faker();
    String name = faker.name().fullName();
    String email = faker.internet().emailAddress();

    // EasyRandom
    EasyRandomParameters p = new EasyRandomParameters().seed(42).stringLengthRange(5, 20);
    EasyRandom rnd = new EasyRandom(p);
    User u = rnd.nextObject(User.class);

    // Blend
    u.setFullName(faker.name().fullName());
    u.setEmail(faker.internet().emailAddress());

TL;DR
Takeaways:
• Use DataFaker for realism.
• Use EasyRandom for structure.
• Seeded randomness = reproducible tests.
• Return timestamps for observability.
• Locale-aware data makes demos shine.
• Keep the core generation framework-agnostic.

Resources & next steps
Try this next:
• Add request params: count, locale, seed.
• Produce CSV/JSON/SQL dumps to seed environments.
• Bundle Postman/HTTP files for the team.
• Wire simple metrics around generation time/count.
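
Simple metrics around generation time and count can start as a timing wrapper. A minimal sketch with hypothetical names (TimedGeneration, Result); a real service would publish these numbers through whatever metrics library it already uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class TimedGeneration {
    private TimedGeneration() {}

    // What a caller would log or publish: the generated items and the elapsed time for the batch.
    record Result<T>(List<T> items, long elapsedNanos) {}

    // Calls the generator `count` times and times the whole batch with the monotonic nanoTime clock.
    static <T> Result<T> generate(int count, Supplier<T> generator) {
        long start = System.nanoTime();
        List<T> items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            items.add(generator.get());
        }
        return new Result<>(items, System.nanoTime() - start);
    }
}
```

The supplier can wrap any generator, for example () -> random.nextObject(User.class) from the EasyRandom snippet.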

Source code available
Spring Boot implementation:
• https://github.com/wallaceespindola/fake-data-springboot/
Quarkus implementation:
• https://github.com/wallaceespindola/fake-data-quarkus

Let's stay connected
LinkedIn: linkedin.com/in/wallaceespindola
GitHub: github.com/wallaceespindola
Twitter: @wsespindola
Dev Community: dev.to/wallaceespindola
DZone Articles: dzone.com/users/1254611/wallacese.html
Slides: speakerdeck.com/wallacese
Medium: medium.com/@wallaceespindola
Substack: wallaceespindola.substack.com
Pulse: linkedin.com/in/wallaceespindola/recent-activity/articles/

Thank you!!!
