Visual Language Action Instruction Tuning