Ada 2012 Tutorial #2

Ada 2012 Tutorial
Parsing (and Streams)

Ada Lovelace, the namesake of the Ada programming language, considered the world’s first computer programmer

We left off the previous tutorial at parsing input from a user or a file, so we’re going to address that today. First, however, I need to introduce Streams.

Streams are a method to read or write any object to any medium, and thus they are doubly generalized. This also means that you are bounded by the most restrictive set of operations common to all mediums. As an example, you cannot provide position control in a general manner because not all transmission modes are random-access (like receiving a radio-signal), and not all streams are bi-directional (like a light-sensor).

In the informal parlance we’ve adopted we can just sat that all types have stream attributes, accessed with ‘Read and ‘Write, because all elementary types have them and the compiler knows how to compose compound types from elementary types, so you don’t normally have to keep track of elements in a compound type. (You do have to keep track of them if you’re writing both read and write that must be be functionally, rather than perfectly, inverse-operations; this is not a deficiency, but because you are implementing a protocol.)

So let’s see how to do it.

Here’s the spec for lists with the appropriate modifications.

lisp-lists.ads
With
LISP.Elements,
Ada.Streams.Stream_IO,
Ada.Containers.Indefinite_Vectors;

Use All Type LISP.Elements.Element;

Package LISP.Lists is

    -- The following package, "implementation", is the
    -- instantiation of a generic package; a generic must be
    -- explicitly instantiated before its use in your program
    -- text.
    Package Implementation is new Ada.Containers.Indefinite_Vectors
      (Index_Type => Positive,  Element_Type => LISP.Elements.Element);
    -- PS: The above is using named association, which can be helpful.

    -- The following line declares inheritance, with no
    -- extension to the data of the base object/class; there
    -- are, however, new 'methods' which are added below.
    Type LIST is new Implementation.Vector with null record
	with Read => Read, Write => Write;
    -- For some reason, the subtype-rename causes an error on
    -- visibility; therefore, the type-extension above is used.
    --     SubType List is Implementation.Vector;

    -- Primitive operations are those subprograms which
    -- interact with a type and are declared before the type is
    -- frozen. (Read as: in the same declaring unit, before
    -- some point which the compiler must commit to a type.)
    --
    -- The following are primitive operations for LIST.
    Function Evaluate(Input : List) Return List;
    Function Head(Input : List) Return LISP.Elements.Element;
    Function Tail(Input : List) Return List;
    Function To_String  (Input : List)   Return String;
    Function Homoginized_List( Input : List ) Return Boolean;

Private
    Use Implementation, Ada.Streams;

    -- Read and Write are ptivate because the user can access
    -- them via LIST'Read and LIST'Write stream operations.

    procedure Read(
		    Stream : not null access Root_Stream_Type'Class;
		    Item   : out LIST
		  );

    procedure Write(
		    Stream : not null access Root_Stream_Type'Class;
		    Item   : in LIST
		  );

    -- The two following comment lines are the old method of
    -- specifying custom subprograms be used for stream
    -- interaction:

    -- For LIST'Read  Use Read;
    -- For List'Write Use Write;

End Lisp.Lists;

Then, in the body, we add the following lines:

Portions of lisp-lists.adp
    -- Write is simple: wirte out the string representation of
    -- the list to the stream.
    procedure Write(
            Stream : not null access Root_Stream_Type'Class;
            Item   : in LIST
           )  is
    begin
        String'Write( Stream, Item.To_String );
    end Write;

    -- Because reading is itself more complicated we will
    -- make the function itself SEPARATE.
    procedure Read(
            Stream : not null access Root_Stream_Type'Class;
            Item   : out LIST
          ) is separate;

Here is the completion-file for the separate. As you can see it’s quite a bit more complicated than the write-procedure, I trust the comments will explain the reasoning behind this.

lisp-lists-read.adb
With
LISP.Strings,
Ada.Strings.Fixed;

Separate (LISP.Lists)

-- Read is perhaps overly complicated for a parsing example,
-- this is unavoidable given the constraints we are working
-- with in choosing to use strings AND to read/save directly in
-- valid LISP-text formant; namely:
--    (A) Generalized IO,
--        (1) No look-ahead is possible,
--        (2) No "unreading" is possible,
--    (B) LISP-text
--        (1) No way to tell, in advance, the length of
--            a list or element thereof.
--        (2) No way to tell, in advance, the type of an
--            element.
-- Therefore, the solution below is presented in a rather step-
-- wise manner: read the list from the stream, use that string
-- as the input of a string-to-list subprogram [consisting of
-- element recognition and list-construction]. Yes, this all
-- *COULD* [and should] be done in a single pass; that however,
-- might be too much for a set of "intro into the language"
-- tutorials. (Though arguably an interpreter is, too.)
--
-- Lastly, it would be remiss of me to fail to explain the
-- advantages in the above after detailing the disadvantages:
--    (A) Generalized IO
--        (1) Because it is general, a stream may be
--            generated by anything from a disk-drive
--            to a radio-receiver to a network itself.
--        (2) Because is is general, this method allows
--            us to swap out bodies for read, so we
--            can test multiple implementations of a
--            stream-parser AND different file-formats.
--    (B) LISP Program-Text
--        (1) There need not be differing functions for
--            reading in source-code and reading in a
--            line typed by the user -- both may use
--            the same stream-functionality.
--        (2) Output is absurdly simple -- just pipe all
--            the list's elements's To_String output and
--            make sure to output the '(' & ')' for
--            (respectively) before and after writing.

procedure Read(
        Stream : not null access Root_Stream_Type'Class;
        Item   : out LIST
          ) is
    Use ASCII, Ada.Characters.Handling, Ada.Streams.Stream_IO, LISP.Strings;

    -- The actual ASCII Whitespace characters.
    Subtype Whitespace is Character Range NUL..' '
    with Static_Predicate => Whitespace in NUL | HT..CR |  ' ';

    -- The following function gets the next non white-space character.
    -- Used to "prime the pump" of the stream, as any leading
    -- whitespace should be disregarded.
    function Get_Next_Character Return Character is
    begin
        Return Result : Character:= ASCII.NUL do
            loop
            Character'Read( Stream, Result );
            Exit When Result not in Whitespace;
            end loop;
        End return;

        Exception
        When End_Error => Raise Parse_Error;
    end Get_Next_Character;

    ---------------------
    -- STATE VARIABLES --
    ---------------------
    -- Look_Ahead contains the imminent character from the stream.
    -- Current_Character, the working character.
    -- Stream_End marks whether the stream has been exhausted.
    Look_Ahead,
    Current_Character    : Character:= ASCII.NUL;
    Stream_End        : Boolean:= False;

    -- Advance shifts the old look_ahead into current_character
    -- and reads in a new character from the stream. In case of
    -- an empty stream a flag is set and NUL is substituted.
    -- The END_ERROR exception propagates if the flag is set.
    procedure Advance is
    begin
        Current_Character:= Look_Ahead;
        Character'Read( Stream, Look_Ahead );
        Stream_End:= False; -- Reset the flag.
    exception
        when END_ERROR =>
            if not Stream_End then
                Stream_End:= True;
                Look_Ahead:= ASCII.NUL;
            else
                Raise END_ERROR;
            end if;
    end Advance;

    -- Build_String reads the characters from the stream, and
    -- returns the resultant string.
    Function Build_String Return String is

        Function Advance_Build(
               Open_Parens : Natural;
               In_String   : Boolean
             ) return String is
            OP : Constant Character:= '(';
            CP : Constant Character:= ')';
            C  : Constant Character:= Current_Character;

            -- Active_Paren handles the increment and decriment
            -- of the Open_Paren being passed to Build_String.
            Function Active_Paren Return Natural is
              ( if In_String then Open_Parens
                elsif C = CP then Open_Parens - 1
                elsif C = OP then Open_Parens + 1
                else Open_Parens -- Any other character
              );

            -- Inside_String is merely a shorthand for
            -- determining when we enter/exit a string.
            Function Inside_String Return Boolean is
              (if C = '"' then Not In_String else In_String);

        begin
            if Active_Paren = 0 then
                return "";
            end if;

            Advance;
            return C & Advance_Build(
                    Open_Parens => Active_Paren,
                    In_String   => Inside_String
                );
        exception
            When End_Error => return "";
        end Advance_Build;

    begin
        return Advance_Build(0, False);
    end Build_String;

begin
    -- Here we "prime the pump" for our stream processing.
    Look_Ahead:= Get_Next_Character;

    -- Ensure we are starting a new list.
    if Look_Ahead /= '(' then
        raise Parse_Error;
    end if;

    -- We advance once to move '(' to the current position.
    Advance;

    Declare
        -- String_Value holds a List's string representation.
        String_Value : Constant String:= Build_String;

        -- Parse_String returns the List from the string which
        -- has been passed as the input.
        Function Parse_String( Input : String:= String_Value ) Return List is

            Function Is_Numeric(S:String) Return Boolean Renames
                LISP.Strings.Valid_Number;

            Function Element_Type( Input : String ) Return Data_Type is
                Initial : Constant Character := Input(Input'First);
            begin
                if Initial = '"' then
                    Return String_Type;
                elsif Initial = '(' then
                    Return List_Type;
                elsif To_Upper(Input) = "NULL" then
                    Return Empty_Type;
                elsif Is_Numeric(Input) then
                    -- If we add in floating types we'll need
                    -- to differentiate between integers and
                    -- the floating-type here.
                    Return Integer_Type;
                elsif Valid_ID(Input) then
                    Return Name_Type;
                end if;
                -- Since the above covers all possible values
                -- for Data_Types, if we made it here it
                -- indicates something went wrong in parsing.
                Raise PARSE_ERROR;
            end Element_Type;

        subtype Internal_Indecies is Positive Range
          Positive'succ(Input'First)..Input'Last;

        -- Scan is a procedure that, taking some index,
        -- returns the next index of interest -- there are
        -- two such functions: one for getting the start of
        -- a new element, and one for getting the end of it.
        Procedure Scan( Index : in out Internal_Indecies ) is
        begin
            while Input(Index) in Whitespace
              and Internal_Indecies'Last > Index loop
                Index:= Internal_Indecies'Succ(Index);
            end loop;
        end Scan;

        Function Scan( Old_Index : in  Internal_Indecies;
                       New_Index : out Positive ) Return LISP.Elements.Element is

            Start,
            Stop      : Internal_Indecies:= Old_Index;

            Function This Return Character is
              ( Input(Stop) );
            Function Element Return String is
              ( Input(Start..Stop) );

            Function Is_List  Return Boolean is
              (Input(Start) = '(');
            Function Is_String Return Boolean is
              (Input(Start) = '"');

        begin
            -- Because the element separators ", " and " "
            -- are equivalent, we will strip out any initial
            -- comma found and proceed as if it were whitespace.
            if Input(Start) = ',' then
                Start:= Start + 1;
            end if;

            -- Consume leading whitespace, because we might
            -- have something like ( Bobby , Dave, ,Jo).
            Scan(Start);
            Stop:= Start;

            if Is_List then
                -- if it is a list we need to balance parentheses.
                declare
                    Paren_Count   : Natural:= 1;
                    Inside_String : Boolean:= False;
                begin
                    -- We don't want our start-index to
                    -- terminate the loop prematurely.
                    loop
                        Stop:= Internal_Indecies'Succ(Stop);

                        case this is
                        when '(' =>
                            if not Inside_String then
                                Paren_Count:= Natural'Succ(Paren_Count);
                            end if;
                        when ')' =>
                            if not Inside_String then
                                Paren_Count:= Natural'Pred(Paren_Count);
                            end if;
                        when '"' =>
                            Inside_String:= not Inside_String;
                        when others => null;
                        end case;

                        exit when Paren_Count not in Positive
                               or Stop = Internal_Indecies'Last;
                    end loop;

                    if this /= ')' then
                        Raise PARSE_ERROR;
                    end if;
                end;
            else
                -- If element is a string, then we need to
                -- search for the closing quote, otherwise
                -- we need to search for the next character
                -- delimiting items [whitespace | ','].
                while ( if Is_String then (if stop > start then this = '"')
                        else this not in Whitespace | ',' ) loop
                    Stop:= Stop + 1;
                end loop;

                -- Now we correct for consuming delimiters,
                -- or not consuming them (in string's case),
                -- as we certainly don't want to have
                -- "BOBBY " or "BOBBY," as an function-name
                -- or leave the closing quote on our index.
                Stop:= (if Is_String then Stop + 1 else Stop - 1);
            end if;

            -- Set the output-parameter.
            New_Index:= 1 + Stop;

            -- Yes, we already know whether or not the
            -- element is a string; however, using a case-
            -- statement [without OTHERS] ensures that all
            -- choices are covered... so if we change the
            -- what types the interpreter handles, we will
            -- receive a compiler error informing us of it.
            case Element_Type(Element) is
            when Empty_Type   =>
                Return Create;
            when Integer_Type =>
                declare
                    Value : Integer:= Integer'Value( Element );
                begin
                    Return Create(Value);
                end;
            when Name_Type | String_Type   =>
                -- Strings and identifiers are both strings,
                -- the difference is that one is surrounded
                -- by quotes, the other not; the following
                -- conditional statement handles that for
                -- our call to Create.
                Return Create(
                        Item => (if Is_String
                                 then Get_Internals(Element)
                                 else Element),
                        Identifier => not Is_String
                    );
            when List_Type    =>
                -- Recursion, Oy!
                Return Create( Parse_String(Element) );
            end case;
        end Scan;

        Index : Positive:= Internal_Indecies'First;

    -------------------------
    --  PARSE_STRING BODY  --
    -------------------------
    begin
        Return Result: List do
            PARSE_ELEMENTS:
            loop
                Scan( Index );
                declare
                    Working : LISP.Elements.Element
                                Renames Scan( Index, Index );
                    -- Yes, we could get rid of this whole
                    -- declare-block by saying:
                    --    Result.Append( Scan( Index, Index ) );
                begin
                    Result.Append( Working );
                    exit PARSE_ELEMENTS when
                          Index not in Internal_Indecies'First..
                                Internal_Indecies'Pred(Internal_Indecies'Last);
                end;
            end loop PARSE_ELEMENTS;
        End return;
    end Parse_String;

    begin
        if String_Value'Length < 2 then raise PARSE_ERROR; end if;
        Item:= Parse_String;
    end;
end Read;

Then finally modify your Testbed.adb to use the ‘Read and ‘Write attributes. Here’s what my testbed (yes, it’s a different name) looks like after stripping out the more “for beginner’s” comments:

Working.ads
With
LISP.Lists,
LISP.Elements,
Ada.Streams.Stream_IO, Ada.Text_IO.Text_Streams,
Ada.Text_IO;      -- Ada.Text_IO: exactly what it says.

-- This names the current compilation unit "working".
Procedure Working is

    Function Open( File_Name : String ) Return Ada.Text_IO.File_Type is
        Use Ada.Text_IO;
    begin
        Return Result : File_Type do
            Open(File => Result,
                  Mode => In_File,
                  Name => File_Name
             );
        End return;
    end open;

    Output_Stream : Ada.Text_IO.Text_Streams.Stream_Access :=
      Ada.Text_IO.Text_Streams.Stream( Ada.Text_IO.Standard_Output );

    Input_Stream :  Ada.Text_IO.Text_Streams.Stream_Access :=
      Ada.Text_IO.Text_Streams.Stream( Open("test.lsp") );

   -- Stubs for Read, Eval, and Print.

   -- Hack to terminate the loop; we will replace this later, with an
   -- orderly shutdown-mechanism.
   Loop_Count : Positive:= 1;

    Use LISP.Lists, LISP.Elements;

    -- Our working list.
    Working : List;

Procedure Read is
    Function Create_ID( ID : String ) Return LISP.Elements.Element is
      ( Create( Item => ID, Identifier => True) );
begin
  case Loop_Count is
    When 8  => Working.Prepend( Create_ID("+") );
    When 10 => Working:= Create_ID("define") &
                          Create_ID("bob") &
                          Create_ID("car") &
                          Create(
                            Create("Fire") &
                            Create("Ice") &
                            Create("Air")
                          );
              -- Above: (DEFINE BOB CAR ("Fire" "Ice" "Air"))
              -- Result: BOB => ("Fire" "Ice" "Air")
    When Others => Working.Append( Create( Loop_Count ) );
  end case;
end Read;

    Procedure Eval is
    begin
        -- Eval replaces our working list with the result of the evaluation.
        Working:= Working.Evaluate;
    end Eval;

    Procedure Print is
    begin
    -- To print we simply pass to Put_Line the result of Working.To_String
    -- Note:  When working with tagged-types we may use an alternitive to
    --        the more common object-dot-method; it can be called just like
    --        the subprogram it is.
        Ada.Text_IO.Put_Line( To_String(Working) );
    end Print;

    Sub_Loop : array (Positive Range <>) of Not Null Access Constant String:=
      ( New String'("car"), New String'("cdr"), New String'("cadr"),
        New String'("cdar"), New String'("caddr"), New String'("BOB") );

Begin
   REPL:
   loop
      Read;
      Eval;
      Print;
    -- Testing functions -- Between the call to print and the loop's exit.
    if Loop_Count = 10 then
        Working:= Create("A") & Create("B") & Create("C");
        For E of Sub_Loop loop
        declare
            Use Type LISP.Lists.Implementation.Vector;
            Temp : LIST := Create(E.All, True) & Working.Copy;
            Use Ada.Text_IO;
        begin
            Put_Line( ASCII.HT & Temp.To_String );
            Temp:= Temp.Evaluate;
            Put_Line( E.All & ASCII.HT & "=>" & Temp.To_String );
        end;
        End loop;
    end if;
      -- Set up the exit-condition; the name is optional but useful when dealing
      -- with nested loops.
      Exit REPL when Loop_Count = 10;
      -- Update the loop-counter;
      Loop_Count:= Loop_Count + 1;
   end loop REPL;

    Ada.Text_IO.Put_Line( "Stream output of Working:" );
    LIST'Write( Output_Stream, Working );
    Ada.Text_IO.New_Line;

    Ada.Text_IO.Put_Line( "Stream input of Working:" );
    LIST'Read( Input_Stream, Working );
    LIST'Write( Output_Stream, Working );
    Ada.Text_IO.New_Line;

   -- Print some text to indicate an orderly shutdown was achieved.
   Ada.Text_IO.Put_Line( "Goodbye." );

End Working;

When run with the input below, it produces the output shown.

Input (test.lsp) Output
(DEFINE maybe ("A", "B" ,"C" , "G"))
[Count: 1]
(1)
[Count: 2]
(1, 2)
[Count: 3]
(1, 2, 3)
[Count: 4]
(1, 2, 3, 4)
[Count: 5]
(1, 2, 3, 4, 5)
[Count: 6]
(1, 2, 3, 4, 5, 6)
[Count: 7]
(1, 2, 3, 4, 5, 6, 7)
[Count: 8]
(28)
[Count: 9]
(28, 9)
[Count: 10]
("BOB", "Defined")
    (car, "A", "B", "C")
car    =>("A")
    (cdr, "A", "B", "C")
cdr    =>("B", "C")
    (cadr, "A", "B", "C")
cadr    =>("B")
    (cdar, "A", "B", "C")
cdar    =>()
    (caddr, "A", "B", "C")
caddr    =>("C")
    (BOB, "A", "B", "C")
BOB    =>("Fire", "Ice", "Air")
Stream output of Working:
("A", "B", "C")
Stream input of Working:
(DEFINE, maybe, ("A", "B", "C", "G"))
Goodbye.

Everything there works as expected.

You won’t be able to just take the source from the previous examples and these, compile and run though. This is because I did some restructuring, in particular consolidating the internal functions for dealing with strings into a single child package which you may have noticed in the with clause of lisp-lists-read: LISP.Strings. I also separated the string_type and name_type fields so there is now an internal difference and strong-typing will prevent us from accidentally conflating the two.

Because all the changes would require showing the contents of everything you’ve already seen with relatively little change I will forego that in favor of leaving you with a link to my github page for this tutorial’s repository.

facebooktwittergoogle_plusredditpinterestlinkedinmail