We have an action framework (not cocos yet). Instead of cloning the action, we use a context object that holds the data relevant only to one running instance of the action: for example the current angle, the object(s) we are animating, time, etc. The dispatcher creates the context by calling an interface on the action, and while the action runs the dispatcher is responsible for passing the context in. No instance values on the action are modified, so there is no need for cloning/copying.
This does require a context object for any action that needs to track state while it is running. However, the client consuming the action is unaware of the context, and gets the added benefit of not having to clone/copy actions.
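To make the idea concrete, here is a rough sketch of how the context approach could look. It is not our actual code: `Sprite`, `RotateBy`, `RotateContext`, `create_context`, `step` and `Dispatcher` are all names made up for illustration, and it assumes a duration greater than zero.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    """Hypothetical animation target."""
    rotation: float = 0.0


@dataclass
class RotateContext:
    """Per-run state; one instance per (action, target) pair."""
    target: Sprite
    start_angle: float
    elapsed: float = 0.0


class RotateBy:
    """Describes a rotation. Holds no per-run state, so it is never cloned."""

    def __init__(self, duration: float, delta_angle: float):
        self.duration = duration  # assumed > 0
        self.delta_angle = delta_angle

    def create_context(self, target: Sprite) -> RotateContext:
        # The "interface on the action" that the dispatcher calls.
        return RotateContext(target=target, start_angle=target.rotation)

    def step(self, ctx: RotateContext, dt: float) -> bool:
        # Only the context and the target are modified, never self.
        ctx.elapsed = min(ctx.elapsed + dt, self.duration)
        t = ctx.elapsed / self.duration
        ctx.target.rotation = ctx.start_angle + self.delta_angle * t
        return ctx.elapsed >= self.duration  # True when finished


class Dispatcher:
    """Owns the contexts and passes them in on every update."""

    def __init__(self):
        self._running = []

    def run(self, action, target):
        self._running.append((action, action.create_context(target)))

    def update(self, dt: float):
        # Step every running action and keep the unfinished ones.
        self._running = [(a, c) for a, c in self._running if not a.step(c, dt)]


# One action instance shared by two sprites, no cloning.
spin = RotateBy(duration=2.0, delta_angle=90.0)
a, b = Sprite(), Sprite(rotation=10.0)
dispatcher = Dispatcher()
dispatcher.run(spin, a)
dispatcher.run(spin, b)
dispatcher.update(1.0)  # halfway through: a.rotation == 45.0, b.rotation == 55.0
```

Note that the client only ever touches `spin` and the dispatcher; the context never appears in the calling code.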
We also have the concept of action trees, which behave more like a binary tree: each action has next and child properties instead of collections. Basically, this means fewer animation classes are needed to create more complex animations.
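A minimal sketch of what such a tree could look like follows. The semantics here are an assumption made for the example, not something fixed above: `child` is taken to start together with its parent, and `next` to start when the node finishes. `ActionNode` and `start_times` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ActionNode:
    name: str
    duration: float
    child: Optional["ActionNode"] = None  # assumed: starts together with this node
    next: Optional["ActionNode"] = None   # assumed: starts when this node finishes


def start_times(node: Optional[ActionNode], t0: float = 0.0,
                out: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Walk the tree and record when each action would start."""
    if out is None:
        out = {}
    if node is None:
        return out
    out[node.name] = t0
    start_times(node.child, t0, out)                 # runs alongside the node
    start_times(node.next, t0 + node.duration, out)  # runs after the node
    return out


# "move" and "fade" together, then "rotate", then "scale",
# built from a single node class instead of separate sequence/spawn classes.
tree = ActionNode(
    "move", 1.0,
    child=ActionNode("fade", 0.5),
    next=ActionNode("rotate", 2.0, next=ActionNode("scale", 1.0)),
)
schedule = start_times(tree)
# schedule == {"move": 0.0, "fade": 0.0, "rotate": 1.0, "scale": 3.0}
```

Under these assumptions, two links per node are enough to express both sequencing and grouping, which is where the saving in animation classes would come from.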
I'm going to be writing this as we port, but I wanted to share it and also get feedback on any issues that may arise with this approach.