QTRB: TEAM-BASED REGION BUILDING USING Q-LEARNING TO DERIVE POLICY ON PROGRAMS PARAMETERIZED BY LOCAL REWARD SIGNAL