A quick story from the trenches that might help someone else someday. My game was stuttering in a certain situation: a collision between two objects caused a great big particle explosion, the deallocation of several objects, and a little score number floating up like an angel out of the mess. I'm too lazy to learn how to use Xcode's profiling tools, so I went with my assumption that the cause was removing sprites from a CCSpriteSheet (which the API says is slow). So I wrote a wonderful caching system to recycle sprites and eagerly fired it up on my slowest test device. It still stuttered.
I began commenting out parts of the code and discovered the stutter went away when I removed a line creating a CCBitmapFontAtlas (the floating score). I guessed that creation of the CCBitmapFontAtlas was slow, so I tried recycling them, too, just doing a setString on existing ones when I knew what the specific score was for the collision event. Nope, still stuttered. I tried backporting r1742 (use ccHashSet) to my code. Nope. Removed color-cycling. Nope. Warmed up the cache by creating a few "0123456789" strings. Nope. Finally I tried CCLabel, which I'd thought was totally evil from reading the documentation and this forum. What do you know: it worked!
I'm totally speculating here, but it seems like CCLabel might be slow in a CPU-consumption sense, while CCBitmapFontAtlas might be doing something bad at another level (OpenGL? Texture cache? Flux capacitor?). CCBitmapFontAtlas consistently sucks up around 100 ms in my game on my iPhone 2G.
So the moral of my story is (1) learn how to use the profiler so you don't waste time optimizing the wrong things, or at least test theories by narrowing down the suspect code, and (2) sometimes CCLabel is better than CCBitmapFontAtlas.
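For anyone who wants to narrow down suspect code without the profiler, here's a rough sketch of the kind of timing wrapper I mean. It's plain C rather than anything cocos2d-specific: time_step and create_score_label are made-up names for illustration, and the loop is only a stand-in for whatever call you suspect. For scale, a frame at 60 fps has about 16.7 ms to work with, so a 100 ms step eats roughly six frames.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Monotonic wall-clock time in milliseconds. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Hypothetical helper: run one suspect step, print and return its duration. */
static double time_step(const char *label, void (*step)(void)) {
    assert(step != NULL);
    double start = now_ms();
    step();
    double elapsed = now_ms() - start;
    printf("%s: %.2f ms\n", label, elapsed);
    return elapsed;
}

/* Stand-in for a suspect (assumption: in the game this would be the code
   that creates the floating score label). */
static void create_score_label(void) {
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        sink += i;
    }
}
```

Wrap each suspect in turn, e.g. time_step("score label", create_score_label), and compare the printed numbers on your slowest device. It's crude next to a real profiler, but it tests a theory before you build a caching system around it.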
This might be obvious to some of you, and if so, I'd appreciate hearing more details.