Lexer
1. Read a file and split it into lines using System.lineSeparator (the newline produced by Enter).
2. For each line, read character by character and use each character as input to a Deterministic Finite Automaton (DFA).
3. Concatenate the characters, creating the largest STRING possible. Stop when you reach a delimiter, whitespace, operator, or quotation mark and the current state allows it. If there are more characters in the line, create a new line with those characters and go to step 2.
4. For each WORD, report its TOKEN. Report ERROR as the token value for STRINGs that the DFA does not accept (i.e., wrong items).
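Step 1 can be sketched in Java as follows. This is a minimal sketch, not the course's reference implementation: the class name `LineReader` and the methods `splitLines` and `readLines` are hypothetical names chosen for illustration, and `Files.readString` assumes Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for step 1: read a file and split it into lines.
class LineReader {
    // Splits raw text on the platform line separator.
    // Pattern.quote is used because String.split expects a regular expression.
    static List<String> splitLines(String text) {
        return Arrays.asList(text.split(Pattern.quote(System.lineSeparator())));
    }

    // Reads the whole file as UTF-8 text and returns its lines.
    static List<String> readLines(Path file) throws IOException {
        return splitLines(Files.readString(file));
    }
}
```

Note that System.lineSeparator depends on the operating system, so a file written on a different platform may not split as expected with this approach.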
A DFA

State | B,b | 0  | 1  | ... | Delimiter, operator, whitespace, quotation mark
S0    | SE  | S1 | SE | SE  | Stop
S1    | S2  | SE | SE | SE  | Stop
S2    | SE  | S3 | S3 | SE  | Stop
S3    | SE  | S3 | S3 | SE  | Stop
SE    | SE  | SE | SE | SE  | Stop
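The transition table can be stored as a two-dimensional array indexed by state and by input column. The sketch below rests on two assumptions that the slide does not state explicitly: the "..." column is read as "any other character", and S3 is treated as the only accepting state (so the DFA recognizes binary literals such as 0b101). The class name `BinaryDfa` and its methods are hypothetical. The "Stop" column is not part of the array because stopping is handled by the lexer loop, not by the DFA.

```java
// Hypothetical encoding of the DFA table from the slide.
class BinaryDfa {
    // State indices follow the table rows: S0, S1, S2, S3, SE.
    static final int S0 = 0, S1 = 1, S2 = 2, S3 = 3, SE = 4;

    // Columns: 0 = 'B' or 'b', 1 = '0', 2 = '1', 3 = any other character (assumed).
    static final int[][] TABLE = {
        /* S0 */ {SE, S1, SE, SE},
        /* S1 */ {S2, SE, SE, SE},
        /* S2 */ {SE, S3, S3, SE},
        /* S3 */ {SE, S3, S3, SE},
        /* SE */ {SE, SE, SE, SE},
    };

    // Maps an input character to its column in TABLE.
    static int column(char c) {
        switch (c) {
            case 'B': case 'b': return 0;
            case '0': return 1;
            case '1': return 2;
            default: return 3;
        }
    }

    // Returns the next state for the given state and input character.
    static int next(int state, char c) {
        return TABLE[state][column(c)];
    }

    // Runs the DFA over a whole word; true only if it ends in S3 (assumed accepting).
    static boolean accepts(String word) {
        int state = S0;
        for (char c : word.toCharArray()) {
            state = next(state, c);
        }
        return state == S3;
    }
}
```

Under these assumptions, `BinaryDfa.accepts("0b101")` is true, while `"0b"` stops in S2 and `"0b12"` falls into the error state SE, so both are rejected.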
1. Initialize currentState to “s0”.
2. Create an empty string to store the current word.
3. Create an index to track the position in the input line.
4. Loop through each character in the input line.
▪ if (the character is not an operator, delimiter, or space) {
- get the next state from the DFA,
- append the character to the current token,
- and update currentState.
▪ } otherwise {
- If currentState is an accepting state, store the token with its state name; otherwise, store it as an error.
- If the current character is an operator, store it as an “OPERATOR” token.
- If the current character is a delimiter, store it as a “DELIMITER” token.
- Reset currentState to “s0” and clear the string storage.
}
5. After processing all characters, check the last string/word and store it accordingly.
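The loop above could look roughly like this in Java. It is a sketch under several assumptions, not the course's reference solution: the operator and delimiter character sets are illustrative guesses, S3 is taken as the only accepting state, the "..." column is read as "any other character", and empty words (for example, between two consecutive spaces) are skipped rather than reported as errors. The class `SimpleLexer`, the helper `flush`, and the "TYPE text" token format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lexer for one input line, following the numbered steps.
class SimpleLexer {
    static final String[] NAMES = {"S0", "S1", "S2", "S3", "SE"};
    static final int S0 = 0, S1 = 1, S2 = 2, S3 = 3, SE = 4;

    // Columns: 0 = 'B' or 'b', 1 = '0', 2 = '1', 3 = any other character (assumed).
    static final int[][] TABLE = {
        {SE, S1, SE, SE}, // S0
        {S2, SE, SE, SE}, // S1
        {SE, S3, S3, SE}, // S2
        {SE, S3, S3, SE}, // S3
        {SE, SE, SE, SE}, // SE
    };

    static final String OPERATORS = "+-*/=<>";   // assumed set
    static final String DELIMITERS = "(){}[];,"; // assumed set

    static int column(char c) {
        switch (c) {
            case 'B': case 'b': return 0;
            case '0': return 1;
            case '1': return 2;
            default: return 3;
        }
    }

    // Stores the pending word as "<state name> <word>" if accepted, else "ERROR <word>".
    // Empty words are skipped (an assumption; the slides do not cover this case).
    static void flush(StringBuilder word, int state, List<String> out) {
        if (word.length() == 0) return;
        String type = (state == S3) ? NAMES[state] : "ERROR";
        out.add(type + " " + word);
        word.setLength(0);
    }

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder(); // step 2
        int state = S0;                           // step 1
        for (int i = 0; i < line.length(); i++) { // steps 3 and 4
            char c = line.charAt(i);
            boolean op = OPERATORS.indexOf(c) >= 0;
            boolean delim = DELIMITERS.indexOf(c) >= 0;
            if (!op && !delim && !Character.isWhitespace(c)) {
                state = TABLE[state][column(c)];
                word.append(c);
            } else {
                flush(word, state, tokens);
                if (op) tokens.add("OPERATOR " + c);
                if (delim) tokens.add("DELIMITER " + c);
                state = S0;
            }
        }
        flush(word, state, tokens);               // step 5
        return tokens;
    }
}
```

With these assumptions, `SimpleLexer.tokenize("0b101+0b2;")` yields the tokens `S3 0b101`, `OPERATOR +`, `ERROR 0b2`, and `DELIMITER ;`, in that order.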
slides can only be used as study material for the Compilers course at Universidad Panamericana. They cannot be distributed or used for another purpose.