Resolve VSN-1986 "Feature/improve route optimization with direction-aware reward calculation"
- Add direction-aware reward calculation in A2COptimizer
- Implement _get_destinations() and _calculate_reward() methods
- Re-optimize passenger order after trip merging in TripOptimizer
- Make RoutePlanner._optimize_route a static method and pass it the correct A2C params
- Adapt starting position based on trip direction (departure/return)

Closes VSN-1986