AI Tinkerers – Hong Kong Online Meetup (December) - Benchmarking LLMs (AA TechTalk)

Tuesday, December 2nd, 2025 • 8PM to 9PM (HKT)

Virtual Meeting

This event has already taken place.

Attendees: 18+ registered

Attendees include CEOs, CTOs, and AI engineers from LexisNexis and Pyramid AI, as well as an Alibaba Cloud MVP. Python and AI/ML are the most common skills among them.

🤖 Join Us for an Interactive Online Discussion: Benchmarking LLMs for Application Use 🤖

Register to join an interactive and practical session on one of the biggest questions facing builders today: How do we choose the right LLM for our application? With new models dropping every week and public benchmarks rarely reflecting real-world behaviour, we’ll dig into how practitioners actually test, evaluate, and compare models for their own use cases. Expect a highly technical, community-driven conversation for builders in Hong Kong and the Greater Bay Area.

About AI Tinkerers Hong Kong and Greater Bay Area

AI Tinkerers is a community for individuals building, tinkering, and working with all things AI. It’s designed for practitioners with technical, machine learning, and entrepreneurial backgrounds who are actively building and working with foundation models, such as large language models (LLMs) and generative AI. The purpose is to promote sharing, facilitate learning and collaborations, and support each other in innovating in the rapidly evolving AI space.

Who This Is For and What to Expect

This virtual meetup is tailored for individuals with technical, machine learning, or entrepreneurial backgrounds who are actively building with foundation models like LLMs and generative AI. If you’re passionate about creating AI-driven applications or exploring new use cases, or you have hands-on experience with automation systems, this is for you!

Session Focus: Benchmarking LLMs for Application Use

Think of this session as a “practical benchmarking roundtable” for active builders.

How do you evaluate models when benchmarks don’t match your workload?
What testing setups, prompts, and metrics do real teams use?
How are people comparing outputs across models quickly, cheaply, and reliably?
And when do you stop experimenting and lock in a model for production?

Bring your experiments, your results, your failures, and your questions — we want to hear what’s working and what isn’t.

Why this topic?

Because everyone’s feeling the same pain: the pace of model releases is insane, public benchmarks don’t reflect production behaviour, and every builder ends up reinventing the same evaluation processes in isolation.

Our community asked for a deep dive into:

how practitioners are choosing between models,
how they validate quality beyond vibes,
and how to avoid spending weeks manually eyeballing model outputs.

Whether you’re building an agent system, a RAG pipeline, a customer-facing chatbot, or a backend automation service, good model evaluation is becoming a core skill. This session is a chance to trade notes, compare approaches, and walk away with concrete techniques you can apply immediately.

Sponsors, Partners, Volunteers

Chapter Founder & Lead Organizer: Timothy

Current/Previous Co-Organizers & Volunteers: Nikhil, Lawrence, Ian, Adam, Iulian, Augustin, William, Anne, Ivan, Yanan, Andy, Ali G, Jeremy, Alexander, Shahman

This AA TechTalk is hosted by: Augustin, Andy, and Alexander

🤖 Connect, Collaborate, and Create 🤖

This virtual meetup is your chance to:

Network with fellow tinkerers: Meet and collaborate with other active builders, AI engineers, entrepreneurs, and researchers who share your passion for building the future with AI.
Learn from industry experts: Gain insights and inspiration from leading figures in the AI space.

RSVP Today!

Space is limited, so don’t miss out on this incredible opportunity to connect with the Hong Kong AI community. RSVP now.

NOTE: We are adding a “classification” so all our Tinkerers know how technical each event is going to be. This way you know exactly what you’re signing up for.

Level 4: Interactive Circuit 🎮

Find out more about our classification system here!