# Changelog

All notable changes to microbench are documented here.

## [2.1.0] - 2026-04-26

### New features
- `MBResourceUsage` mixin: captures POSIX `getrusage()` data — user and system CPU time, peak RSS (in bytes, normalised across platforms), minor and major page faults, block I/O operations, and voluntary/involuntary context switches. Results are stored as a list in `resource_usage`, one entry per timed iteration, aligned index-for-index with `call.durations` and `call.returncode`. Added as a default CLI mixin, so every CLI run captures it automatically.
  - CLI mode: on POSIX, uses `os.wait4()` to capture the exact rusage of each individual child process as reported by the kernel, including a reliable `maxrss` per iteration regardless of the `--iterations` or `--warmup` count.
  - Python API mode: uses `RUSAGE_SELF` with a before/after delta around each individual call — one entry per timed iteration, aligned with `call.durations`. Warmup calls are excluded. `maxrss` is omitted (it is the lifetime process high-water mark, not a per-call value); use `MBPeakMemory` for per-call peak memory.
  - On platforms where the stdlib `resource` module is unavailable, the `resource_usage` key is omitted from the record entirely.
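In case it helps to picture the Python API mode, here is a minimal sketch of a before/after `RUSAGE_SELF` delta. `rusage_delta` and its output keys are illustrative, not microbench API, and `maxrss` is deliberately left out for the reason noted above:

```python
import resource

def rusage_delta(fn, *args, **kwargs):
    """Illustrative helper: measure one call via a before/after
    RUSAGE_SELF delta. maxrss is intentionally excluded because it is
    a lifetime high-water mark, not a per-call value."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    result = fn(*args, **kwargs)
    after = resource.getrusage(resource.RUSAGE_SELF)
    usage = {
        "user_time": after.ru_utime - before.ru_utime,
        "system_time": after.ru_stime - before.ru_stime,
        "minor_page_faults": after.ru_minflt - before.ru_minflt,
        "major_page_faults": after.ru_majflt - before.ru_majflt,
        "voluntary_ctx_switches": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_ctx_switches": after.ru_nivcsw - before.ru_nivcsw,
    }
    return result, usage
```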
### Enhancements

- `--mixin defaults` keyword (CLI): `defaults` can be used as a mixin name that expands to the standard default set (`python-info`, `host-info`, `slurm-info`, `loaded-modules`, `working-dir`, `resource-usage`). This makes it easy to add one or more extra mixins without listing all six defaults explicitly: `microbench --mixin defaults file-hash -- ./job.sh`.
- `file-hash` mixin — automatic argument file scanning (CLI): the default hash list now includes not only the command executable (`cmd[0]`) but also any command-line arguments (`cmd[1:]`) that resolve to existing files on disk before the command is executed. Passing `--hash-file` still overrides the default entirely; the Python API is unaffected. The hash algorithm name is now stored under `mb.file_hash_algorithm`.
### Bug fixes

- `MBResourceUsage` — `pre_run_triggers`/`post_run_triggers` now forward via `super()`: composing `MBResourceUsage` with another mixin that also implements `pre_run_triggers` or `post_run_triggers` previously caused the second mixin's hooks to be silently skipped. Both methods now propagate correctly through the MRO.
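The fix relies on Python's cooperative multiple inheritance. A minimal sketch (all class names here are invented for illustration) shows why forwarding via `super()` lets every mixin's hook run:

```python
class BenchCore:
    def pre_run_triggers(self, bm_data):
        pass  # terminates the cooperative chain

class ResourceMixin(BenchCore):
    def pre_run_triggers(self, bm_data):
        bm_data["resource"] = "captured"
        super().pre_run_triggers(bm_data)  # forward so later mixins run

class CustomMixin(BenchCore):
    def pre_run_triggers(self, bm_data):
        bm_data["custom"] = "captured"
        super().pre_run_triggers(bm_data)

class Bench(ResourceMixin, CustomMixin):
    pass
```

Without the `super()` call in `ResourceMixin`, the MRO chain stops there and `CustomMixin.pre_run_triggers` is silently skipped.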
### Documentation

- `cli_compatible` class attribute removed from built-in mixins: this attribute was never read at runtime — CLI availability is governed solely by the `MIXIN_REGISTRY` in `cli/registry.py`. The extending guide example has been updated to drop it; custom mixins that set it can safely remove it without any behavioural change.
- Fixed the documentation on writing custom mixins to note that they must be added to the registry to be detected by the CLI.
## [2.0.0] - 2026-03-17

Microbench v2 is a significant upgrade with many new features compared to v1.1.0. Be sure to review the breaking changes below before upgrading.

### New features
- Command-line interface (`microbench -- COMMAND`): wrap any external command and record host metadata alongside timing without writing Python code. Useful for SLURM jobs, shell scripts, and compiled executables.
  - Records `command` and `returncode` (a list, one entry per timed iteration) alongside the standard timing fields. Mixins are specified by short kebab-case names without the `MB` prefix (e.g. `host-info`); the original MB-prefixed names are also accepted.
  - Use `--mixin MIXIN [MIXIN ...]` to select the metadata to capture (defaults to `host-info`, `slurm-info`, `loaded-modules`, `python-info` and `working-dir`).
  - Use `--show-mixins` to list all available mixins with descriptions; use `--field KEY=VALUE` to attach extra labels.
  - Use `--iterations N` and `--warmup N` for repeat timing.
  - Use `--stdout[=suppress]` and `--stderr[=suppress]` to capture subprocess output into the record (output is re-printed to the terminal unless `=suppress` is given).
  - Use `--monitor-interval SECONDS` to sample child-process CPU and memory over time.
  - Some mixins expose their own configuration flags (shown in `--show-mixins` and `--help`).
  - Capture failures are non-fatal by default (`capture_optional = True`), making the CLI safe across heterogeneous cluster nodes.
  - The process exits with the first non-zero return code seen across all timed iterations if present, or zero (success) otherwise.
- `summary(results)` / `bench.summary()`: prints min / mean / median / max / stdev of `call.durations` across all results. No dependencies required beyond the Python standard library. `bench.summary()` is a one-liner convenience that calls `bench.get_results()` internally; the module-level `summary(results)` accepts any list of dicts and can be composed with other results-processing steps.

  ```python
  from microbench import MicroBench, summary

  bench = MicroBench()

  @bench
  def my_function():
      ...

  for _ in range(10):
      my_function()

  bench.summary()
  # n=10 min=0.000031 mean=0.000038 median=0.000036 max=0.000059 stdev=0.000008

  # or with an explicit results list:
  summary(bench.get_results())
  ```
- `bench.time(name)` sub-timing: [Python API] label phases inside a single benchmark record with named timing sections. Sub-timings accumulate in `call.timings` as `[{"name": ..., "duration": ...}, ...]` in call order. Compatible with `bench.record()`, `bench.arecord()`, `@bench` (sync and async), and `bench.record_on_exit()`. Calling it outside an active benchmark is a silent no-op; `call.timings` is absent when `bench.time()` is never called.

  ```python
  with bench.record('pipeline'):
      with bench.time('parse'):
          data = parse(raw)
      with bench.time('transform'):
          result = transform(data)
  ```
- Async support: [Python API] the `@bench` decorator now detects `async def` functions and returns an `async def` wrapper that must be awaited. A new `bench.arecord(name)` method provides the async counterpart of `bench.record()` for use with `async with`. All mixins, static fields, output sinks, `iterations`, and `warmup` work identically to the sync path. `MBLineProfiler` raises `NotImplementedError` at decoration time when used with an async function (line profiling of coroutines is not supported).

  ```python
  @bench
  async def fetch():
      await asyncio.sleep(0.01)

  asyncio.run(fetch())

  async with bench.arecord('load'):
      await load_data()
  ```

  Note: elapsed time includes event-loop interleaving from other concurrent tasks; run in an otherwise-idle event loop for repeatable results.
- `bench.record_on_exit(name, handle_sigterm=True)`: [Python API] registers a process-exit handler that writes one benchmark record when the script terminates. Captures the wall-clock duration from the call site to exit, plus all mixin fields. Designed for SLURM jobs and batch scripts where restructuring code around a decorator is impractical. Key behaviours:
  - By default installs a SIGTERM handler (main thread only) that writes the record, chains to any existing SIGTERM handler, then re-delivers SIGTERM so the process exits with the conventional code 143 (128 + 15).
  - Wraps `sys.excepthook` to capture unhandled exceptions into an `exception` field before the process exits.
  - Adds an `exit_signal` field when the exit was triggered by SIGTERM.
  - Falls back to writing the record to `sys.stderr` if the primary output sink raises (e.g. the filesystem is unmounted at exit time).
  - Calling it a second time on the same instance replaces the first registration and resets the start time.
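A much-simplified sketch of the exit-time recording idea, using only `atexit` (the real method additionally installs SIGTERM and `sys.excepthook` handlers; `record_on_exit_sketch` and its return value are hypothetical, invented for this example):

```python
import atexit
import time

def record_on_exit_sketch(name, sink):
    """Illustrative helper: capture a start time now, write one record
    when the interpreter exits."""
    start = time.perf_counter()

    def _write_record():
        sink.append({
            "call": {"name": name,
                     "durations": [time.perf_counter() - start]},
        })

    atexit.register(_write_record)
    return _write_record  # returned only to make the sketch testable
```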
- `bench.record(name)` context manager: [Python API] times an arbitrary code block and writes one record, without requiring the code to be in a named function. All mixins, static fields, and output sinks behave identically to the decorator form.
- Exception capture: [Python API] when a benchmarked block raises — via `bench.record()` or a `@bench`-decorated function — the record is written before the exception propagates. An `exception` field is added containing `{"type": ..., "message": ...}`. The exception is always re-raised. With `--iterations N`, timing stops at the first exception.
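The write-before-propagate behaviour can be sketched with a self-contained context manager; this `record` is a simplified stand-in for illustration, not the real `bench.record()` (which also runs mixin captures and output sinks):

```python
import time
from contextlib import contextmanager

@contextmanager
def record(name, results):
    """Illustrative sketch: time a block, capture any exception into the
    record, append the record before the exception propagates, and
    always re-raise."""
    start = time.perf_counter()
    entry = {"call": {"name": name}}
    try:
        yield
    except BaseException as exc:
        entry["exception"] = {"type": type(exc).__name__,
                              "message": str(exc)}
        raise  # re-raised after the finally block writes the record
    finally:
        entry["call"]["durations"] = [time.perf_counter() - start]
        results.append(entry)
```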
- `MBPythonInfo` mixin replaces `MBPythonVersion`: records a `python` dict with `version`, `prefix` (`sys.prefix`), and `executable` (`sys.executable`), giving a complete picture of the running interpreter in one field. `MBPythonVersion` has been removed (see the breaking changes below). `MBPythonInfo` is included in `MicroBench` by default (Python API) and in the CLI default mixin set; `--no-mixin` suppresses it on the CLI as usual.
- `MBLoadedModules` mixin: captures the loaded Lmod / Environment Modules software stack into a `loaded_modules` dict mapping module name to version string (e.g. `{"gcc": "12.2.0", "openmpi": "4.1.5"}`). Reads the standard `LOADEDMODULES` environment variable — no subprocess, no extra dependencies. Empty dict when no modules are loaded. Included in the CLI defaults alongside `MBHostInfo`, `MBSlurmInfo`, and `MBWorkingDir`.
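Parsing `LOADEDMODULES` amounts to splitting colon-separated `name/version` entries. A sketch, assuming that format (the helper name is illustrative, not microbench API):

```python
import os

def parse_loaded_modules(env=os.environ):
    """Illustrative helper: parse LOADEDMODULES, which holds
    colon-separated entries such as 'gcc/12.2.0:openmpi/4.1.5'."""
    raw = env.get("LOADEDMODULES", "")
    modules = {}
    for entry in filter(None, raw.split(":")):
        name, _, version = entry.partition("/")
        modules[name] = version
    return modules
```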
- `MBWorkingDir` mixin: captures the absolute path of the working directory at benchmark time into `call.working_dir`. No dependencies. Included in the CLI defaults — useful for reproducibility when comparing results across nodes or directories.
- `MBGitInfo` mixin: captures the repository root path, current commit hash, branch name, and dirty flag (uncommitted changes present) via `git` ≥ 2.11 on `PATH`. Stored in `git`. Set `git_repo` to inspect a specific repository directory.
- `MBPeakMemory` mixin: captures the peak Python memory allocation during the benchmarked function as `call.peak_memory_bytes` (bytes), using `tracemalloc` from the standard library. No extra dependencies required.
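The `tracemalloc` pattern behind per-call peak measurement can be sketched as follows (the helper is illustrative, not the mixin's actual code):

```python
import tracemalloc

def with_peak_memory(fn, *args, **kwargs):
    """Illustrative helper: run one call under tracemalloc and return
    the call's result together with the peak traced allocation."""
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak_bytes
```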
- `MBSlurmInfo` mixin: captures all `SLURM_*` environment variables into a `slurm` dict (keys lowercased, `SLURM_` prefix stripped). Empty dict when running outside a SLURM job. Supersedes the manual `env_vars = ('SLURM_JOB_ID', ...)` pattern.
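The capture itself reduces to a small dict comprehension; a sketch with an invented helper name:

```python
import os

def capture_slurm(env=os.environ):
    """Illustrative helper: collect SLURM_* variables with keys
    lowercased and the SLURM_ prefix stripped."""
    return {
        key[len("SLURM_"):].lower(): value
        for key, value in env.items()
        if key.startswith("SLURM_")
    }
```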
- `MBFileHash` mixin: records a cryptographic checksum of specified files in the `file_hashes` field (a dict mapping path to hex digest). Defaults to hashing `sys.argv[0]` — the running script. Set `hash_files` to an iterable of paths to hash specific files instead. Set `hash_algorithm` to any algorithm accepted by `hashlib.new` (default: `'sha256'`).
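Chunked hashing with `hashlib.new` might look like this sketch (the helper name and chunk size are illustrative choices, not microbench internals):

```python
import hashlib

def hash_files(paths, algorithm="sha256"):
    """Illustrative helper: checksum each file, reading in chunks so
    large files are not loaded into memory at once."""
    digests = {}
    for path in paths:
        h = hashlib.new(algorithm)
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(65536), b""):
                h.update(chunk)
        digests[path] = h.hexdigest()
    return digests
```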
- `MBCgroupLimits` mixin: captures the CPU quota and memory limit enforced by the Linux cgroup filesystem. Works for SLURM jobs and Kubernetes pods (cgroup v1 and v2). Fields in `cgroups`: `cpu_cores_limit` (float — quota ÷ period, or `null` if unlimited), `memory_bytes_limit` (int, or `null` if unlimited), and `version` (1 or 2). Returns `{}` on non-Linux systems or when the cgroup filesystem is unavailable.
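On cgroup v2, `cpu_cores_limit` can be derived from the `cpu.max` file, whose contents are `<quota> <period>` or `max <period>` when unlimited. A parsing sketch with an invented helper:

```python
def parse_cpu_max(text):
    """Illustrative helper: derive a CPU-core limit from cgroup v2
    cpu.max contents ('<quota> <period>' or 'max <period>')."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit enforced
    return int(quota) / int(period)
```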
- `MBCondaPackages` improvements:
  - Queries the environment identified by `CONDA_PREFIX` (the shell's active conda environment) rather than `sys.prefix`. Falls back to `sys.prefix` when `CONDA_PREFIX` is not set.
  - Falls back to `CONDA_EXE` if `conda` is not on `PATH` (common in non-interactive SLURM batch scripts where conda is activated but its `bin/` is not on `PATH`).
  - Replaces the separate `conda_versions` field with a unified `conda` dict containing `name` (`CONDA_DEFAULT_ENV`), `path` (`CONDA_PREFIX`), and `packages` (the version dict). Either of `name`/`path` may be `None` if the corresponding variable is unset. With `get_results(flat=True)` these expand to `conda.name`, `conda.path`, `conda.packages.<pkg>`, etc.
- `MicroBenchBase`: the core benchmarking machinery is now exposed as `MicroBenchBase` (no default mixins). `MicroBench` inherits from both `MicroBenchBase` and `MBPythonInfo`. Subclass `MicroBenchBase` directly when you need a completely bare benchmark class with no automatic captures.
- `warmup` parameter: pass `warmup=N` to run the function `N` times before timing begins, priming caches or JIT compilation without affecting results. Warmup calls are unrecorded and do not interact with the monitor thread or capture triggers.
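The warmup semantics can be sketched in a few lines (this `timed` helper is illustrative, not microbench API):

```python
import time

def timed(fn, iterations=5, warmup=2):
    """Illustrative helper: run fn `warmup` times untimed, then record
    one duration per timed iteration."""
    for _ in range(warmup):
        fn()  # prime caches / JIT; results discarded
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations
```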
- Multi-sink output architecture (#52): results can now be written to multiple destinations simultaneously by passing an `outputs` list to `MicroBench`. Four classes make up the new output API:
  - `Output` — abstract base class; subclass this to implement custom sinks.
  - `FileOutput` — writes JSONL to a file path or file-like object (wraps the previous default behaviour).
  - `RedisOutput` — writes to a Redis list.
  - `HttpOutput` (new in v2) — POSTs each benchmark result to an HTTP/HTTPS endpoint.

  The existing `outfile` parameter and class-level `outfile` attribute continue to work as shorthand for a single `FileOutput`. Passing both `outfile` and `outputs` raises `ValueError`.

  Example — write to a file and Redis simultaneously:

  ```python
  from microbench import MicroBench, FileOutput, RedisOutput

  bench = MicroBench(outputs=[
      FileOutput('/home/user/results.jsonl'),
      RedisOutput('microbench:mykey', host='redis-host', port=6379),
  ])
  ```

  `get_results()` delegates to the first sink that supports reading back results (`FileOutput` and `RedisOutput` both do).
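A custom sink reduces to "accept a record, persist it, optionally read it back". Here is a hedged, stand-alone sketch of a JSONL sink in the spirit of `FileOutput`; the class and method names are invented for illustration and do not assume the exact `Output` interface:

```python
import json

class JsonlOutput:
    """Illustrative JSONL sink: one JSON record per line, flushed
    immediately, with optional read-back support."""
    def __init__(self, stream):
        self.stream = stream

    def write(self, record):
        self.stream.write(json.dumps(record) + "\n")
        self.stream.flush()

    def read_back(self):
        self.stream.seek(0)
        return [json.loads(line) for line in self.stream if line.strip()]
```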
- `get_results(format=..., flat=...)`: `get_results` now accepts two keyword arguments.
  - `format='dict'` (default) — returns a list of dicts; no pandas required. `format='df'` — returns a pandas DataFrame (the previous default behaviour).
  - `flat=True` — flattens nested dict fields (e.g. `slurm`, `cgroups`, `git`) into dot-notation keys (`slurm.job_id`, `call.name`). Works for both formats without requiring pandas.
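The `flat=True` transformation is recursive dict flattening with dot-joined keys; a sketch (this `flatten` helper is illustrative, not microbench API):

```python
def flatten(record, prefix=""):
    """Illustrative helper: flatten nested dicts into dot-notation keys,
    e.g. {'mb': {'run_id': 'x'}} -> {'mb.run_id': 'x'}."""
    flat = {}
    for key, value in record.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=full_key + "."))
        else:
            flat[full_key] = value
    return flat
```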
- `capture_optional` class attribute: set `capture_optional = True` on a benchmark class to catch exceptions from `capture_` and `capturepost_` methods instead of aborting the benchmark call. Failures are recorded in `call.capture_errors` (a list of `{"method": ..., "error": ...}` dicts); the field is absent when all captures succeed. Designed for production jobs on heterogeneous cluster nodes where optional dependencies may not be present on every node.
- `python-dateutil` dependency removed from `LiveStream`: timestamp parsing now uses `datetime.fromisoformat()` from the standard library. Remove `python-dateutil` from your environment if it was only installed for microbench.
### Breaking changes (vs v1.1.0)

- Removed deprecated mixins: `MBPythonVersion`, `MBHostCpuCores`, and `MBHostRamTotal` have been removed.
  - Replace `MBPythonVersion` with `MBPythonInfo` (captures `python.version`, `python.prefix`, `python.executable`).
  - Replace `MBHostCpuCores` and/or `MBHostRamTotal` with `MBHostInfo`, which now captures `host.cpu_cores_logical`, `host.cpu_cores_physical`, and `host.ram_total` automatically when psutil is installed.
- Namespace restructuring of record fields: all benchmark record fields are now grouped into top-level namespace dicts, making records self-documenting and easier to query. The complete rename table is below.

  Core fields move into `mb` (static config) and `call` (per-call data):
  | Old key | New key |
  |---|---|
  | `mb_run_id` | `mb.run_id` |
  | `mb_version` | `mb.version` |
  | `timestamp_tz` | `mb.timezone` |
  | `duration_counter` | `mb.duration_counter` |
  | `start_time` | `call.start_time` |
  | `finish_time` | `call.finish_time` |
  | `run_durations` | `call.durations` |
  | `function_name` | `call.name` |
  | `telemetry` | `call.monitor` |
  | `env_*` (flat keys) | `env.*` (dict) |
  | `package_versions` (from `capture_versions`) | `python.loaded_packages` |
  Mixin fields move to typed namespaces:

  | Old key | New key |
  |---|---|
  | `hostname` | `host.hostname` |
  | `operating_system` | `host.os` |
  | `cpu_cores_physical` | `host.cpu_cores_physical` |
  | `ram_total` | `host.ram_total` |
  | `args` | `call.args` |
  | `kwargs` | `call.kwargs` |
  | `return_value` | `call.return_value` |
  | `line_profiler` | `call.line_profiler` |
  | `package_versions` (`MBGlobalPackages`) | `python.loaded_packages` |
  | `package_versions` (`MBInstalledPackages`) | `python.installed_packages` |
  | `package_paths` | `python.installed_package_paths` |
  | `nvidia_<attr>` (multiple flat dicts) | `nvidia` (list of per-GPU dicts with a `uuid` key) |
  Unchanged namespaces: `conda`.

  Migration: use `get_results(flat=True)` to access fields via dot-notation keys (`call.name`, `mb.run_id`, `host.hostname`, etc.) in pandas or scripts without rewriting nested dict access. Alternatively, update field access to the new nested structure: `result['call']['name']`, `result['mb']['run_id']`, `result['host']['hostname']`, etc.
- `get_results()` now returns a list of dicts by default: the default `format='dict'` returns a list of plain Python dicts and requires no dependencies. Pass `format='df'` to get a pandas DataFrame (the previous behaviour). Update existing callers: `bench.get_results()` → `bench.get_results(format='df')`.
- `telemetry` renamed to `monitor` (#51): the background sampling thread has been renamed throughout the API to better reflect its intent (continuous monitoring, not data transmission).
  - Class `TelemetryThread` → `MonitorThread`
  - Class variable `telemetry_interval` → `monitor_interval`
  - Class variable `telemetry_timeout` → `monitor_timeout`
  - Result field `bm_data['telemetry']` → `bm_data['monitor']`
  - Internal attribute `self._telemetry_thread` → `self._monitor_thread`
- `MicroBenchRedis` removed (#52): use `MicroBench(outputs=[RedisOutput(...)])` instead.

  Before:

  ```python
  from microbench import MicroBenchRedis

  class RedisBench(MicroBenchRedis):
      redis_connection = {'host': 'localhost', 'port': 6379}
      redis_key = 'microbench:mykey'

  bench = RedisBench()
  ```

  After:

  ```python
  from microbench import MicroBench, RedisOutput

  bench = MicroBench(outputs=[RedisOutput('microbench:mykey',
                                          host='localhost', port=6379)])
  ```
- `LiveStream` updated for the v2 record schema: field references updated from the v1 flat schema (`function_name`, `hostname`, `start_time`, `finish_time`) to the v2 nested schema (`call.name`, `host.hostname`, `call.start_time`, `call.finish_time`). Records produced by microbench v1 are no longer parsed correctly by `LiveStream`; this is expected given the v2 schema migration documented in the breaking changes above.
## [1.1.0] - 2026-03-13

### New features

- `mb_run_id` and `mb_version` fields added to every record (#53): both fields are included automatically without any configuration.
  - `mb_run_id` — a UUID generated once at import time and shared by all `MicroBench` instances in the same process. Allows records from independent bench suites to be correlated with `groupby('mb_run_id')`.
  - `mb_version` — the version of the `microbench` package that produced the record; essential for long-running studies where the benchmark code evolves.