Multi-box models describing the evolution of cooking-generated aerosols emitted inside an urban street canyon are developed. By contrast with previous box models of urban pollutants, multiple boxes are introduced by partitioning the canyon space into dynamically distinct boxes. Aerosol dynamical processes, namely coagulation and deposition, are represented through standard parameterisations; the exchange or ventilation timescale between boxes is specified by the mean residence time, which is obtained from a Lagrangian particle model. Comparison with predictions from a large-eddy simulation model with a coupled sectional aerosol model indicates improved agreement when nine boxes are used in place of a single box; for example, the relative error in the canyon-averaged number concentration for deep-frying emissions decreases from ∼30% to ∼5%. The improvement is smaller for boiling emissions, which contain fewer small particles. The inclusion of extra boxes eliminates the need for explicit segregation corrections and enables the size spectrum and coarse-grained spatial structure of the aerosol number concentration to be captured.
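As a rough illustration of the box-model structure described above, the following minimal sketch couples two equal-volume boxes (a lower and an upper canyon box) through an exchange timescale, with the upper box ventilated to a background. It is not the authors' implementation: the nine-box partition and the size-resolved aerosol treatment are reduced here to two boxes and a monodisperse coagulation term, and the function names (`step_boxes`, `run`), the forward-Euler time stepping, and all parameter values are assumptions chosen only for illustration.

```python
def step_boxes(n, dt, k_coag, k_dep, tau_exchange, tau_vent, n_background=0.0):
    """Advance [lower, upper] number concentrations by one forward-Euler step.

    Hypothetical two-box sketch with equal box volumes:
      - coagulation: monodisperse approximation, dn/dt = -0.5 * k_coag * n**2
      - deposition:  first-order loss, dn/dt = -k_dep * n
      - exchange:    relaxation between boxes on timescale tau_exchange
      - ventilation: relaxation of the upper box towards n_background
    """
    lower, upper = n
    # In-box aerosol losses (coagulation plus deposition)
    loss_lower = 0.5 * k_coag * lower**2 + k_dep * lower
    loss_upper = 0.5 * k_coag * upper**2 + k_dep * upper
    # Net transfer from the lower box to the upper box
    exchange = (lower - upper) / tau_exchange
    # Net transfer from the upper box to the air above the canyon
    vent = (upper - n_background) / tau_vent
    new_lower = lower + dt * (-loss_lower - exchange)
    new_upper = upper + dt * (-loss_upper + exchange - vent)
    return [new_lower, new_upper]


def run(n0, n_steps, dt, **params):
    """Integrate the two-box system for n_steps steps and return the final state."""
    n = list(n0)
    for _ in range(n_steps):
        n = step_boxes(n, dt, **params)
    return n


# Illustrative (assumed) values: concentrations in cm^-3, times in s
final = run(
    [1.0e5, 0.0],
    n_steps=600,
    dt=1.0,
    k_coag=1.0e-9,      # cm^3 s^-1, assumed coagulation coefficient
    k_dep=1.0e-4,       # s^-1, assumed deposition rate
    tau_exchange=60.0,  # s, assumed inter-box exchange timescale
    tau_vent=120.0,     # s, assumed ventilation timescale
)
print(final)
```

In a fuller model, each exchange timescale would be set from the mean residence time diagnosed for the corresponding box, and the scalar concentration in each box would be replaced by a size distribution.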