with different lengths to integers using a dictionary (vocab.json) and padded to the maximum length in the batch.

[Figure: raw character sequences, e.g. ['G','E','T', …, '%','1'], are converted char to int and padded with 0 within each batch, e.g. [71,69,84, …, 37,49,0,0].]
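The encoding and padding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: the contents of vocab.json are not given here, so a toy vocabulary built from character code points stands in for it; the padding index 0 is inferred from the zeros in the figure; the unknown-character index and the function names `build_vocab`, `encode`, and `pad_batch` are hypothetical, as are the two request strings.

```python
from typing import Dict, List

PAD_ID = 0  # assumption: 0 is the padding index, as the zeros in the figure suggest
UNK_ID = 1  # assumption: index for characters missing from the vocabulary


def build_vocab(samples: List[str]) -> Dict[str, int]:
    """Toy stand-in for vocab.json: map each character to its code point."""
    return {ch: ord(ch) for s in samples for ch in s}


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Convert a raw character sequence to a list of integer ids."""
    return [vocab.get(ch, UNK_ID) for ch in text]


def pad_batch(seqs: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Right-pad every sequence to the longest sequence in this batch."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


# Hypothetical raw requests of different lengths forming one batch.
batch = ["GET /a%1", "POST /item?id=30"]
vocab = build_vocab(batch)
padded = pad_batch([encode(s, vocab) for s in batch])
```

Padding to the longest sequence in each batch, rather than to a global maximum, keeps short batches small; the trade-off is that tensor shapes vary from batch to batch.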