The littlest CPU

Discussion in 'VHDL' started by rickman, Jul 19, 2008.

  1. rickman

    rickman Guest

    I may need to add a CPU to a design I am doing. I had rolled my own
    core once with a 16 bit data path and it worked out fairly well. But
    it was 600 LUT/FFs and I would like to use something smaller if
    possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    about 2000 are currently used. I believe that once I add the CPU
    core, I can take out a lot of the logic since it runs so slowly. The
    fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    100 Hz. I probably would have used a CPU to start with instead of the
    FPGA, but there was a possible need to handle higher speed signals
    which seems to have gone away.
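
    To put those data rates in perspective, here is a rough sketch of the clock
    budget per sample. The 25 MHz clock is an assumed figure for illustration
    only (no clock frequency is given above), and `cycles_per_sample` is a
    made-up helper name.

```python
# Clock cycles available per sample at each data rate mentioned above.
# CLOCK_HZ is an assumed value for illustration; the post does not state one.
CLOCK_HZ = 25_000_000


def cycles_per_sample(rate_hz, clock_hz=CLOCK_HZ):
    """Whole clock cycles that elapse between two consecutive samples."""
    return clock_hz // rate_hz


for rate_hz in (8_000, 1_000, 100):
    print(f"{rate_hz:>5} Hz: {cycles_per_sample(rate_hz):>7} cycles per sample")
```

    Under that assumption even the 8 kHz path leaves a few thousand clocks per
    sample, which is why a slow or even bit-serial core is worth considering.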

    I recall that someone had started a thread about serial
    implementations of processors that were supported by a C compiler. I
    don't think any ever turned up. But the OP had some other
    requirements that may have excluded a few very small designs. Are
    there any CPU cores, serial or parallel, that are significantly
    smaller than 600 LUT/FFs? The Lattice part has LUT memory, even dual
    port, so that is not a constraint; the LUTs can be used for
    registers.

    Rick
     
    rickman, Jul 19, 2008
    #1

  2. On Jul 18, 10:07 pm, rickman <> wrote:
    > I may need to add a CPU to a design I am doing.  I had rolled my own
    > core once with a 16 bit data path and it worked out fairly well.  But
    > it was 600 LUT/FFs and I would like to use something smaller if
    > possible.  The target is a Lattice XP3 with about 3100 LUT/FFs and
    > about 2000 are currently used.  I believe that once I add the CPU
    > core, I can take out a lot of the logic since it runs so slowly.  The
    > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > 100 Hz.  I probably would have used a CPU to start with instead of the
    > FPGA, but there was a possible need to handle higher speed signals
    > which seems to have gone away.
    >
    > I recall that someone had started a thread about serial
    > implementations of processors that were supported by a C compiler.  I
    > don't think any ever turned up.  But the OP had some other
    > requirements that may have excluded a few very small designs.  Are
    > there any CPU cores, serial or parallel, that are significantly
    > smaller than 600 LUT/FFs?  The Lattice part has LUT memory even dual
    > port, so that is not a constraint, the LUTs can be used for
    > registers.
    >
    > Rick



    The Xilinx PicoBlaze is less than 100 LUTs plus one block ram.
    Someone has been working on a simple C compiler for the PicoBlaze, but
    I have not tried it yet. I have used the PicoBlaze in many projects
    and I am quite happy with it.

    I have not used it, but Lattice has the Mico8. Have you looked at
    it? It has been mentioned here as the Lattice equivalent to the
    PicoBlaze.

    Regards,

    John McCaskill
    www.FasterTechnology.com
     
    John McCaskill, Jul 19, 2008
    #2

  3. On Jul 18, 11:09 pm, John McCaskill <> wrote:

    >
    > The Xilinx PicoBlaze is less than 100 LUTs plus one block ram.


    That should be less than 100 slices.

    Regards,

    John McCaskill
     
    John McCaskill, Jul 19, 2008
    #3
  4. rickman

    Guest

    If an 8-bit CPU is fine, you may want to look at my site. The design is
    available in VHDL or Verilog, and free and non-free tools for this CPU
    are easy to find. Everything is discussed in detail at:
    http://bknpk.no-ip.biz/usb_1.html

    "I used 8051 from http://www.cs.ucr.edu/~dalton/i8051/i8051syn. The
    VHDL code has been translated to Verilog to avoid mixed-language
    simulation. The CPU is also slightly modified to be able to use Xilinx
    memories: for ROM I use..."



    On 19 July, 07:23, John McCaskill <> wrote:
    > On Jul 18, 11:09 pm, John McCaskill <> wrote:
    >
    >
    >
    > > The Xilinx PicoBlaze is less than 100 LUTs plus one block ram.

    >
    > That should be less than 100 slices.
    >
    > Regards,
    >
    > John McCaskill
     
    , Jul 19, 2008
    #4
  5. rickman

    Antti Guest

    On 19 July, 06:07, rickman <> wrote:
    > I may need to add a CPU to a design I am doing.  I had rolled my own
    > core once with a 16 bit data path and it worked out fairly well.  But
    > it was 600 LUT/FFs and I would like to use something smaller if
    > possible.  The target is a Lattice XP3 with about 3100 LUT/FFs and
    > about 2000 are currently used.  I believe that once I add the CPU
    > core, I can take out a lot of the logic since it runs so slowly.  The
    > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > 100 Hz.  I probably would have used a CPU to start with instead of the
    > FPGA, but there was a possible need to handle higher speed signals
    > which seems to have gone away.
    >
    > I recall that someone had started a thread about serial
    > implementations of processors that were supported by a C compiler.  I
    > don't think any ever turned up.  But the OP had some other
    > requirements that may have excluded a few very small designs.  Are
    > there any CPU cores, serial or parallel, that are significantly
    > smaller than 600 LUT/FFs?  The Lattice part has LUT memory even dual
    > port, so that is not a constraint, the LUTs can be used for
    > registers.
    >
    > Rick


    I'm the OP of that thread.

    Hi, my interests may differ from yours, but yes, the smallest
    non-serialized CPU (as for your current task) is one of my wishes, and
    there too there is no single definitive winner.

    PicoBlaze, PacoBlaze and Mico8 are out of the question; most others
    are too large.

    I have used a cut-down AVR core in an XP3, but I don't recall the LUT
    count.

    Antti
     
    Antti, Jul 19, 2008
    #5
  6. rickman

    HT-Lab Guest

    "rickman" <> wrote in message
    news:...
    >I may need to add a CPU to a design I am doing. I had rolled my own
    > core once with a 16 bit data path and it worked out fairly well. But
    > it was 600 LUT/FFs and I would like to use something smaller if
    > possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    > about 2000 are currently used. I believe that once I add the CPU
    > core, I can take out a lot of the logic since it runs so slowly. The
    > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > 100 Hz. I probably would have used a CPU to start with instead of the
    > FPGA, but there was a possible need to handle higher speed signals
    > which seems to have gone away.
    >
    > I recall that someone had started a thread about serial
    > implementations of processors that were supported by a C compiler. I
    > don't think any ever turned up. But the OP had some other
    > requirements that may have excluded a few very small designs. Are
    > there any CPU cores, serial or parallel, that are significantly
    > smaller than 600 LUT/FFs?


    I would suggest you check out one of the many free PIC cores available on
    the web. The reason for suggesting a PIC is that it comes with a
    professional IDE from Microchip. Developing a processor is easy, and the web
    is full of wonderful and clever implementations, but at the end of the day,
    if you have to develop software you need a good IDE.

    I just tried a quick push-button synthesis of a 16C54:

    ***********************************************
    Device Utilization for LFXP3C/PQFP208
    ***********************************************
    Resource      Used   Avail   Utilization
    -----------------------------------------------
    LUTs           374    3072      12.17%
    Flipflops       83    3072       2.70%
    Block RAMs       0       6       0.00%
    IOs             67     136      49.26%
    -----------------------------------------------
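
    As a quick sanity check on the report, the sketch below recomputes the
    utilization column and estimates what would be left once the core sits
    next to the existing design. The "about 2000 LUTs already used" figure is
    taken from the original post; the variable and function names are made up
    for illustration.

```python
# (used, available) figures copied from the synthesis report above (LFXP3C).
report = {
    "LUTs": (374, 3072),
    "Flipflops": (83, 3072),
    "Block RAMs": (0, 6),
    "IOs": (67, 136),
}


def utilization(used, avail):
    """Utilization as a percentage, rounded to two decimals."""
    return round(100.0 * used / avail, 2)


for name, (used, avail) in report.items():
    print(f"{name:<10} {used:>4} / {avail:<4} {utilization(used, avail):6.2f}%")

# Assumption: "about 2000" LUTs are already in use, per the original post.
EXISTING_LUTS = 2000
core_luts, total_luts = report["LUTs"]
luts_left = total_luts - EXISTING_LUTS - core_luts
print("LUTs left after adding the core:", luts_left)
```

    This ignores any logic that could be removed once the CPU takes over the
    slow functions, so the real headroom may be larger.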

    Hans
    www.ht-lab.com



    > The Lattice part has LUT memory even dual
    > port, so that is not a constraint, the LUTs can be used for
    > registers.
    >
    > Rick
     
    HT-Lab, Jul 19, 2008
    #6
  7. rickman

    jeppe

    jeppe, Jul 19, 2008
    #7
  8. rickman

    tanukichu

    If you need a very small CPU you could try Quattro:
    http://gle-mips.googlecode.com/files/quattro-1.0.tar.bz2

    It's a 4-bit processor with instructions encoded in 8 bits.
    It has add, sub, mul, jmp and ld/st operations and a small cache,
    and it runs at 120 MHz on a Spartan-3.

    hope it's useful :)
    bye
    Koda
     
    tanukichu, Jul 19, 2008
    #8
  9. rickman

    rickman Guest

    On Jul 19, 2:57 am, Antti <> wrote:
    > On 19 July, 06:07, rickman <> wrote:
    >
    >
    >
    > > I may need to add a CPU to a design I am doing. I had rolled my own
    > > core once with a 16 bit data path and it worked out fairly well. But
    > > it was 600 LUT/FFs and I would like to use something smaller if
    > > possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    > > about 2000 are currently used. I believe that once I add the CPU
    > > core, I can take out a lot of the logic since it runs so slowly. The
    > > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > > 100 Hz. I probably would have used a CPU to start with instead of the
    > > FPGA, but there was a possible need to handle higher speed signals
    > > which seems to have gone away.

    >
    > > I recall that someone had started a thread about serial
    > > implementations of processors that were supported by a C compiler. I
    > > don't think any ever turned up. But the OP had some other
    > > requirements that may have excluded a few very small designs. Are
    > > there any CPU cores, serial or parallel, that are significantly
    > > smaller than 600 LUT/FFs? The Lattice part has LUT memory even dual
    > > port, so that is not a constraint, the LUTs can be used for
    > > registers.

    >
    > > Rick

    >
    > im OP
    >
    > hi I may have different interests, yes smallest nonserialized CPU
    > as for your current task is one of the wishes, and here also there
    > is no one definitive winner
    >
    > pico paco blazes and mico8 are out of the question, most others
    > are too large
    >
    > i have used cut AVR core in XP3 but i dont recall the lut count


    Have you tabulated your findings anywhere? The last time I did a
    survey of ARM7 processors, I put it all into a spreadsheet and posted
    it on the web. I think it was useful for a while, but the market
    overtook it and I couldn't keep up!

    I read your thread about the serial processor and it was interesting.
    I think my project actually has the time to use such a processor, but
    I think you never found one that met your requirements.

    I am not looking for a large address space, but I would like for it to
    be able to read data from an SD card. My design uses FPGAs both on
    the application board and the test fixture. Ultimately I want the
    test fixture to be able to read a programming file from an SD card and
    configure the target FPGA without a programming cable.

    Of all the suggestions, so far the PIC sounds like the best one. I
    couldn't find a C compiler for the picoblaze or the pacoblaze. There
    is mention of someone creating one, but the web site is no longer
    accessible.

    Rick
     
    rickman, Jul 20, 2008
    #9
  10. rickman

    rickman Guest

    On Jul 19, 12:23 am, John McCaskill <> wrote:
    > On Jul 18, 11:09 pm, John McCaskill <> wrote:
    >
    >
    >
    > > The Xilinx PicoBlaze is less than 100 LUTs plus one block ram.

    >
    > That should be less than 100 slices.


    Still, that's at most 200 LUTs (two per slice), which is very small.
    But I can't find a C compiler for it.

    Rick
     
    rickman, Jul 20, 2008
    #10
  11. rickman

    Antti Guest

    On 20 July, 06:58, rickman <> wrote:
    > On Jul 19, 2:57 am, Antti <> wrote:
    >
    >
    >
    > > On 19 July, 06:07, rickman <> wrote:

    >
    > > > I may need to add a CPU to a design I am doing.  I had rolled my own
    > > > core once with a 16 bit data path and it worked out fairly well.  But
    > > > it was 600 LUT/FFs and I would like to use something smaller if
    > > > possible.  The target is a Lattice XP3 with about 3100 LUT/FFs and
    > > > about 2000 are currently used.  I believe that once I add the CPU
    > > > core, I can take out a lot of the logic since it runs so slowly.  The
    > > > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > > > 100 Hz.  I probably would have used a CPU to start with instead of the
    > > > FPGA, but there was a possible need to handle higher speed signals
    > > > which seems to have gone away.

    >
    > > > I recall that someone had started a thread about serial
    > > > implementations of processors that were supported by a C compiler.  I
    > > > don't think any ever turned up.  But the OP had some other
    > > > requirements that may have excluded a few very small designs.  Are
    > > > there any CPU cores, serial or parallel, that are significantly
    > > > smaller than 600 LUT/FFs?  The Lattice part has LUT memory even dual
    > > > port, so that is not a constraint, the LUTs can be used for
    > > > registers.

    >
    > > > Rick

    >
    > > im OP

    >
    > > hi I may have different interests, yes smallest nonserialized CPU
    > > as for your current task is one of the wishes, and here also there
    > > is no one definitive winner

    >
    > > pico paco blazes and mico8 are out of the question, most others
    > > are too large

    >
    > > i have used cut AVR core in XP3 but i dont recall the lut count

    >
    > Have you tabulated your findings anywhere?  The last time I did a
    > survey of ARM7 processors, I put it all into a spread sheet and posted
    > it on the web.  I think it was useful for a while, but the market
    > overtook it and I couldn't keep up!
    >
    > I read your thread about the serial processor and it was interesting.
    > I think my project actually has the time to use such a processor, but
    > I think you never found one that met your requirements.
    >
    > I am not looking for a large address space, but I would like for it to
    > be able to read data from an SD card.  My design uses FPGAs both on
    > the application board and the test fixture.  Ultimately I want the
    > test fixture to be able to read a programming file from an SD card and
    > configure the target FGPA without a programming cable.
    >
    > Of all the suggestions, so far the PIC sounds like the best one.  I
    > couldn't find a C compiler for the picoblaze or the pacoblaze.  There
    > is mention of someone creating one, but the web site is no longer
    > accessible.
    >
    > Rick


    Hi Rick, here is a reply to your post :)
    http://antti-lukats.blogspot.com/2008/07/rules-of-life.html

    In short, at the moment I am doing almost the same thing you intend to do.

    Antti
     
    Antti, Jul 20, 2008
    #11
  12. rickman

    Josep Duran Guest

    >
    > Of all the suggestions, so far the PIC sounds like the best one. I
    > couldn't find a C compiler for the picoblaze or the pacoblaze. There
    > is mention of someone creating one, but the web site is no longer
    > accessible.
    >
    > Rick


    You can find a download link here:

    http://www.asm.ro/fpga/

    Disclaimer: I have never used it myself.


    Josep
     
    Josep Duran, Jul 20, 2008
    #12
  13. rickman

    Henri Guest

    On 19.7.2008 6:07, rickman wrote:
    > I may need to add a CPU to a design I am doing. I had rolled my own
    > core once with a 16 bit data path and it worked out fairly well. But
    > it was 600 LUT/FFs and I would like to use something smaller if
    > possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    > about 2000 are currently used. I believe that once I add the CPU
    > core, I can take out a lot of the logic since it runs so slowly. The
    > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > 100 Hz. I probably would have used a CPU to start with instead of the
    > FPGA, but there was a possible need to handle higher speed signals
    > which seems to have gone away.
    >
    > I recall that someone had started a thread about serial
    > implementations of processors that were supported by a C compiler. I
    > don't think any ever turned up. But the OP had some other
    > requirements that may have excluded a few very small designs. Are
    > there any CPU cores, serial or parallel, that are significantly
    > smaller than 600 LUT/FFs? The Lattice part has LUT memory even dual
    > port, so that is not a constraint, the LUTs can be used for
    > registers.
    >
    > Rick


    Maybe something worth checking:

    http://www.zylin.com/zpu.htm

    From the above website:

    1. The ZPU is now open source. See ZPU mailing list for more details.
    2. BSD license for HDL implementations--no hiccups when using in
    proprietary commercial products. Under the open source royalty free
    license, there are no limits on what type of technology (FPGA,
    anti-fuse, or ASIC) in which the ZPU can be implemented.
    3. GPL license for architecture, documentation and tools
    4. Completely FPGA brand and type neutral implementation
    5. 298 LUT @ 125 MHz after P&R with 16 bit datapath and 4kBytes BRAM
    6. 442 LUT @ 95 MHz after P&R with 32 bit datapath and 32kBytes BRAM
    7. Codesize 80% of ARM thumb
    8. Configurable 16/32 bit datapath
    9. GCC toolchain(GDB, newlib, libstdc++)
    10. Debugging via simulator or GDB stubs
    11. HDL simulation feedback to simulator for powerful profiling
    capabilities
    12. Eclipse ZPU plug-in
    13. eCos embedded operating system support.



    Henri
     
    Henri, Jul 20, 2008
    #13
  14. rickman

    Antti Guest

    On 20 July, 15:21, Henri <> wrote:
    > On 19.7.2008 6:07, rickman wrote:
    >
    >
    >
    > > I may need to add a CPU to a design I am doing.  I had rolled my own
    > > core once with a 16 bit data path and it worked out fairly well.  But
    > > it was 600 LUT/FFs and I would like to use something smaller if
    > > possible.  The target is a Lattice XP3 with about 3100 LUT/FFs and
    > > about 2000 are currently used.  I believe that once I add the CPU
    > > core, I can take out a lot of the logic since it runs so slowly.  The
    > > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > > 100 Hz.  I probably would have used a CPU to start with instead of the
    > > FPGA, but there was a possible need to handle higher speed signals
    > > which seems to have gone away.

    >
    > > I recall that someone had started a thread about serial
    > > implementations of processors that were supported by a C compiler.  I
    > > don't think any ever turned up.  But the OP had some other
    > > requirements that may have excluded a few very small designs.  Are
    > > there any CPU cores, serial or parallel, that are significantly
    > > smaller than 600 LUT/FFs?  The Lattice part has LUT memory even dual
    > > port, so that is not a constraint, the LUTs can be used for
    > > registers.

    >
    > > Rick

    >
    > Maybe something worth checking:
    >
    > http://www.zylin.com/zpu.htm
    >
    >  From the above website:
    >
    >     1.   The ZPU is now open source. See ZPU mailing list for more details.
    >     2. BSD license for HDL implementations--no hiccups when using in
    > proprietary commercial products. Under the open source royalty free
    > license, there are no limits on what type of technology (FPGA,
    > anti-fuse, or ASIC) in which the ZPU can be implemented.
    >     3. GPL license for architecture, documentation and tools
    >     4. Completely FPGA brand and type neutral implementation
    >     5. 298 LUT @ 125 MHz after P&R with 16 bit datapath and 4kBytes BRAM
    >     6. 442 LUT @ 95 MHz after P&R with 32 bit datapath and 32kBytes BRAM
    >     7. Codesize 80% of ARM thumb
    >     8. Configurable 16/32 bit datapath
    >     9. GCC toolchain(GDB, newlib, libstdc++)
    >    10. Debugging via simulator or GDB stubs
    >    11. HDL simulation feedback to simulator for powerful profiling
    > capabilities
    >    12. Eclipse ZPU plug-in
    >    13. eCos embedded operating system support.
    >
    > Henri


    Eh, this is still on my MUST-evaluate list :)

    80% of Thumb? That is nice too. I just wrote my first Thumb assembly
    program, an Atmel DataFlash bootstrap loader; it is about 60 bytes of
    (Thumb) code. It would be fun to see whether Thumb code optimized that
    hard still comes out more compact on the ZPU :)
    My code is really funky: it loads one 32-bit constant and constructs all
    the other constants from it, and it also uses the lower part of the I/O
    address as a mask constant, etc.

    Antti
     
    Antti, Jul 20, 2008
    #14
  15. rickman

    rickman Guest

    On Jul 20, 8:21 am, Henri <> wrote:
    > On 19.7.2008 6:07, rickman wrote:
    >
    >
    >
    > > I may need to add a CPU to a design I am doing. I had rolled my own
    > > core once with a 16 bit data path and it worked out fairly well. But
    > > it was 600 LUT/FFs and I would like to use something smaller if
    > > possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    > > about 2000 are currently used. I believe that once I add the CPU
    > > core, I can take out a lot of the logic since it runs so slowly. The
    > > fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    > > 100 Hz. I probably would have used a CPU to start with instead of the
    > > FPGA, but there was a possible need to handle higher speed signals
    > > which seems to have gone away.

    >
    > > I recall that someone had started a thread about serial
    > > implementations of processors that were supported by a C compiler. I
    > > don't think any ever turned up. But the OP had some other
    > > requirements that may have excluded a few very small designs. Are
    > > there any CPU cores, serial or parallel, that are significantly
    > > smaller than 600 LUT/FFs? The Lattice part has LUT memory even dual
    > > port, so that is not a constraint, the LUTs can be used for
    > > registers.

    >
    > > Rick

    >
    > Maybe something worth checking:
    >
    > http://www.zylin.com/zpu.htm
    >
    > From the above website:
    >
    > 1. The ZPU is now open source. See ZPU mailing list for more details.
    > 2. BSD license for HDL implementations--no hiccups when using in
    > proprietary commercial products. Under the open source royalty free
    > license, there are no limits on what type of technology (FPGA,
    > anti-fuse, or ASIC) in which the ZPU can be implemented.
    > 3. GPL license for architecture, documentation and tools
    > 4. Completely FPGA brand and type neutral implementation
    > 5. 298 LUT @ 125 MHz after P&R with 16 bit datapath and 4kBytes BRAM
    > 6. 442 LUT @ 95 MHz after P&R with 32 bit datapath and 32kBytes BRAM
    > 7. Codesize 80% of ARM thumb
    > 8. Configurable 16/32 bit datapath
    > 9. GCC toolchain(GDB, newlib, libstdc++)
    > 10. Debugging via simulator or GDB stubs
    > 11. HDL simulation feedback to simulator for powerful profiling
    > capabilities
    > 12. Eclipse ZPU plug-in
    > 13. eCos embedded operating system support.
    >
    > Henri


    I'm pretty impressed. Small, fast and with GCC support!

    Rick
     
    rickman, Jul 20, 2008
    #15
  16. The '16 Bit Microcontroller' at Opencores by Dr. Juergen Sauermann is
    also an impressive piece of work.

    rickman wrote:
    > On Jul 20, 8:21 am, Henri <> wrote:
    >> On 19.7.2008 6:07, rickman wrote:
    >>
    >>
    >>
    >>> I may need to add a CPU to a design I am doing. I had rolled my own
    >>> core once with a 16 bit data path and it worked out fairly well. But
    >>> it was 600 LUT/FFs and I would like to use something smaller if
    >>> possible. The target is a Lattice XP3 with about 3100 LUT/FFs and
    >>> about 2000 are currently used. I believe that once I add the CPU
    >>> core, I can take out a lot of the logic since it runs so slowly. The
    >>> fastest parallel data rate is 8 kHz with some at 1 kHz and the rest at
    >>> 100 Hz. I probably would have used a CPU to start with instead of the
    >>> FPGA, but there was a possible need to handle higher speed signals
    >>> which seems to have gone away.
    >>> I recall that someone had started a thread about serial
    >>> implementations of processors that were supported by a C compiler. I
    >>> don't think any ever turned up. But the OP had some other
    >>> requirements that may have excluded a few very small designs. Are
    >>> there any CPU cores, serial or parallel, that are significantly
    >>> smaller than 600 LUT/FFs? The Lattice part has LUT memory even dual
    >>> port, so that is not a constraint, the LUTs can be used for
    >>> registers.
    >>> Rick

    >> Maybe something worth checking:
    >>
    >> http://www.zylin.com/zpu.htm
    >>
    >> From the above website:
    >>
    >> 1. The ZPU is now open source. See ZPU mailing list for more details.
    >> 2. BSD license for HDL implementations--no hiccups when using in
    >> proprietary commercial products. Under the open source royalty free
    >> license, there are no limits on what type of technology (FPGA,
    >> anti-fuse, or ASIC) in which the ZPU can be implemented.
    >> 3. GPL license for architecture, documentation and tools
    >> 4. Completely FPGA brand and type neutral implementation
    >> 5. 298 LUT @ 125 MHz after P&R with 16 bit datapath and 4kBytes BRAM
    >> 6. 442 LUT @ 95 MHz after P&R with 32 bit datapath and 32kBytes BRAM
    >> 7. Codesize 80% of ARM thumb
    >> 8. Configurable 16/32 bit datapath
    >> 9. GCC toolchain(GDB, newlib, libstdc++)
    >> 10. Debugging via simulator or GDB stubs
    >> 11. HDL simulation feedback to simulator for powerful profiling
    >> capabilities
    >> 12. Eclipse ZPU plug-in
    >> 13. eCos embedded operating system support.
    >>
    >> Henri

    >
    > I'm pretty impressed. Small, fast and with GCC support!
    >
    > Rick
     
    Robert F. Jarnot, Jul 23, 2008
    #16
  17. rickman

    rickman Guest

    On Jul 23, 5:26 pm, "Robert F. Jarnot" <>
    wrote:
    > The '16 Bit Microcontroller' at Opencores by Dr. Juergen Sauermann is
    > also an impressive piece of work.


    Can you tell us what you find impressive about it? I took a look and
    it is listed as 800 slices, which means it can be as big as 1600 LUTs.
    That is nearly three times the size of my CPU and an even larger ratio
    compared to the ZPU and others.
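
    To make the size comparison concrete, here is a small sketch that tabulates
    the LUT figures quoted so far in this thread. Slice counts are converted
    with the same two-LUTs-per-slice upper bound used above. The numbers come
    from different FPGA families and tool runs, so this is only a rough
    ranking; the dictionary labels and the `fits` helper are made up for
    illustration.

```python
# LUT figures quoted in this thread; slice counts use an upper bound of
# two LUTs per slice. Different devices and tools, so only a rough ranking.
LUTS_PER_SLICE = 2

cores = {
    "PicoBlaze (<100 slices)": 100 * LUTS_PER_SLICE,
    "ZPU, 16-bit datapath": 298,
    "PIC 16C54": 374,
    "ZPU, 32-bit datapath": 442,
    "rickman's 16-bit core": 600,
    "C16 (800 slices)": 800 * LUTS_PER_SLICE,
}

# "About 3100" LUTs in the XP3 with "about 2000" used, per the first post.
HEADROOM = 3100 - 2000


def fits(luts, headroom=HEADROOM):
    """True if a core of this size fits in the unused part of the device."""
    return luts <= headroom


for name, luts in sorted(cores.items(), key=lambda item: item[1]):
    verdict = "fits" if fits(luts) else "too big"
    print(f"{name:<26} {luts:>5} LUTs  {luts / 600:4.2f}x  {verdict}")
```

    The third column is the size relative to the 600 LUT core from the first
    post.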

    Is it the fact that it has a C compiler and a simulator?

    Rick
     
    rickman, Jul 23, 2008
    #17
  18. What impresses me about this design is the approach -- determine what
    kind of architecture a 'clean' compiler would like to see, and implement
    the corresponding hardware and compiler. Throwing in an RTOS is a nice
    bonus too.

    I agree that your design is very impressive, both in resource usage and
    performance. I like some of the architectural details too, especially
    those borrowed from the transputer (looking back to the transputer for
    ideas is a good idea in my opinion). Having GCC support is a big plus
    too. What I do not have a feeling for is the relative performance of
    the two designs -- do you have any feeling for this?

    (Note to rickman: my initial reply was directly to you, not the
    newsgroup. Sorry. This reply is very similar to the one I sent you
    directly)


    rickman wrote:
    > On Jul 23, 5:26 pm, "Robert F. Jarnot" <>
    > wrote:
    >> The '16 Bit Microcontroller' at Opencores by Dr. Juergen Sauermann is
    >> also an impressive piece of work.

    >
    > Can you tell us what you find impressive about it? I took a look and
    > it is listed as 800 slices which means it can be as big as 1600 LUTs.
    > That is over three times the size of my CPU and an even larger ratio
    > compared to the ZPU and others.
    >
    > Is it the fact that it has a C compiler and a simulator?
    >
    > Rick
     
    Robert F. Jarnot, Jul 23, 2008
    #18
  19. rickman

    rickman Guest

    On Jul 23, 6:57 pm, "Robert F. Jarnot" <>
    wrote:
    > What impresses me about this design is the approach -- determine what
    > kind of architecture a 'clean' compiler would like to see, and implement
    > the corresponding hardware and compiler. Throwing in an RTOS is a nice
    > bonus too.
    >
    > I agree that your design is very impressive, both in resource usage and
    > performance. I like some of the architectural details too, especially
    > those borrowed from the transputer (looking back to the transputer for
    > ideas is a good idea in my opinion). Having GCC support is a big plus
    > too. What I do not have a feeling for is the relative performance of
    > the two designs -- do you have any feeling for this?
    >
    > (Note to rickman: my initial reply was directly to you, not the
    > newsgroup. Sorry. This reply is very similar to the one I sent you
    > directly)


    No problem. I was waiting for this one to appear so I could respond
    in public. I think there is some interest in the discussion.

    Yes, once I had a chance to look a bit more at the docs, I see the
    history and I also like the idea. I'm not sure why it is so large
    though. His design sounds simple with few registers and not even an
    internal stack if I understand correctly. The various Forth-like CPUs
    all have one if not two internal stacks which in effect are local
    memories (in FPGA implementations). I expect (without looking at the
    design in detail) that this design suffers somewhat in speed in that
    things are done sequentially that can be done in parallel in other
    processors. But then those "other" processors are not built to run
    C. So I expect any fair comparison needs to take that into account.

    I can't say my design is impressive really. It is not complete in
    that there are no tools of any sort. I made a crude assembler but
    mostly hand coded in machine language. So I don't really have any
    idea of how fast it would run an application written in a high level
    language. I like to think that it would handle Forth pretty well, but
    I have not spent the time to really get that underway.

    I did see that the C16 (that is Dr. Juergen Sauermann's CPU name) is
    constructed somewhat like the 8080. That processor had a three
    machine cycle instruction timing and may have also used two input
    clocks for each machine cycle (this is really stretching my wayback
    machine). I remember this partly because I have an 8008 computer
    which was the predecessor to the 8080. It used the three machine
    cycles because it only had an 8 bit multiplexed bus. It used two
    cycles to output a 14 bit address (IIRC) and the third cycle was for
    the 8 bits of data. Every instruction was built of these three
    machine cycle memory ops (even if it was a register transfer).

    His machine seems to have emulated that and so uses up to 6 clock
    cycles for a basic instruction. I don't know much about the ZPU, but
    my CPU uses one clock cycle for any instruction other than program
    memory reads which require three cycles.
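
    As a back-of-the-envelope sketch of that comparison: if some fraction of
    the instructions are three-cycle program memory reads and the rest take
    one cycle, the average cycle count per instruction follows directly. The
    fractions below are assumed values, not measurements, and `average_cpi`
    is a made-up helper name.

```python
def average_cpi(read_fraction, normal_cycles=1, read_cycles=3):
    """Average clocks per instruction when `read_fraction` of the
    instructions are program memory reads and the rest take one cycle."""
    return (1 - read_fraction) * normal_cycles + read_fraction * read_cycles


C16_WORST_CASE = 6  # "up to 6 clock cycles" per basic instruction, as above

# The read fractions are assumptions chosen only to show the trend.
for fraction in (0.0, 0.1, 0.25):
    cpi = average_cpi(fraction)
    print(f"{fraction:4.0%} memory reads: {cpi:.2f} cycles/instruction "
          f"(a 6-cycle instruction takes {C16_WORST_CASE / cpi:.1f}x as long)")
```

    Cycles per instruction say nothing about how much work each instruction
    does, so this compares clock counts only, not application speed.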

    You like the variable-length literal instructions à la the Transputer?
    They are used to set up the immediate addresses for jumps and calls
    too. Unfortunately this makes for some trouble with defining
    addresses in the assembler. I never did get that to work correctly.
    Every time a byte was added or subtracted from the opcodes, it would
    move all of the other labels and you had to start over with the
    calculations. I think you could have situations that never
    converged.
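
    The label problem described above is usually handled by iterating
    until the sizes stop changing. The following is a minimal sketch of
    one common approach, not the assembler discussed in this post. It
    assumes a hypothetical encoding in which each instruction byte
    carries 4 bits of operand, and that jumps use absolute, non-negative
    addresses. The names operand_bytes and layout and the program format
    are invented for illustration. Every jump starts at one byte and is
    only ever allowed to grow, which avoids the oscillation that can
    happen when sizes are allowed to both grow and shrink between passes.

```python
def operand_bytes(value):
    """Bytes needed for a non-negative operand at 4 bits per byte (assumed encoding)."""
    return max(1, (value.bit_length() + 3) // 4)


def layout(program):
    """Assign addresses to labels by grow-only relaxation.

    program is a list of ("label", name), ("op", size) or ("jump", name).
    Returns (label_addresses, jump_sizes, passes).
    """
    # Start optimistic: assume every jump fits in a single byte.
    sizes = {i: 1 for i, item in enumerate(program) if item[0] == "jump"}
    passes = 0
    while True:
        passes += 1
        # Step 1: place the labels using the current size guesses.
        addr, labels = 0, {}
        for i, (kind, arg) in enumerate(program):
            if kind == "label":
                labels[arg] = addr
            elif kind == "op":
                addr += arg
            else:
                addr += sizes[i]
        # Step 2: grow any jump whose target no longer fits; never shrink.
        changed = False
        for i, (kind, arg) in enumerate(program):
            if kind == "jump":
                need = operand_bytes(labels[arg])
                if need > sizes[i]:
                    sizes[i] = need
                    changed = True
        if not changed:
            return labels, sizes, passes


if __name__ == "__main__":
    prog = [("jump", "end"), ("op", 15), ("label", "end"), ("op", 1)]
    print(layout(prog))
```

    Because a jump never shrinks once it has grown, the loop should
    always stop, at the cost of occasionally leaving a jump one byte
    longer than the true minimum.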

    Otherwise I was pretty happy with my CPU. But I don't want to
    continue using it if there are better CPUs available. But it will be
    a couple of weeks before I can really spend any time on this.

    Rick
     
    rickman, Jul 24, 2008
    #19
  20. rickman wrote:
    > On Jul 23, 6:57 pm, "Robert F. Jarnot" <>
    > wrote:
    >> What impresses me about this design is the approach -- determine what
    >> kind of architecture a 'clean' compiler would like to see, and implement
    >> the corresponding hardware and compiler. Throwing in an RTOS is a nice
    >> bonus too.
    >>
    >> I agree that your design is very impressive, both in resource usage and
    >> performance. I like some of the architectural details too, especially
    >> those borrowed from the transputer (looking back to the transputer for
    >> ideas is a good idea in my opinion). Having GCC support is a big plus
    >> too. What I do not have a feeling for is the relative performance of
    >> the two designs -- do you have any feeling for this?
    >>
    >> (Note to rickman: my initial reply was directly to you, not the
    >> newsgroup. Sorry. This reply is very similar to the one I sent you
    >> directly)

    >
    > No problem. I was waiting for this one to appear so I could respond
    > in public. I think there is some interest in the discussion.
    >
    > Yes, once I had a chance to look a bit more at the docs, I see the
    > history and I also like the idea. I'm not sure why it is so large
    > though. His design sounds simple with few registers and not even an
    > internal stack if I understand correctly. The various Forth like CPUs
    > all have one if not two internal stacks which in effect are local
    > memories (in FPGA implementations). I expect (without looking at the
    > design in detail) that this design suffers somewhat in speed in that
    > things are done sequentially that can be done in parallel in other
    > processors. But then those "other" processors are not built to run
    > C. So I expect any fair comparison needs to take that into account.
    >
    > I can't say my design is impressive really. It is not complete in
    > that there are no tools of any sort. I made a crude assembler but
    > mostly hand coded in machine language. So I don't really have any
    > idea of how fast it would run an application written in a high level
    > language. I like to think that it would handle Forth pretty well, but
    > I have not spent the time to really get that underway.
    >
    > I did see that the C16 (that is Dr. Juergen Sauermann's CPU name) is
    > constructed somewhat like the 8080. That processor had a three
    > machine cycle instruction timing and may have also used two input
    > clocks for each machine cycle (this is really stretching my wayback
    > machine). I remember this partly because I have an 8008 computer
    > which was the predecessor to the 8080. It used the three machine
    > cycles because it only had an 8 bit multiplexed bus. It used two
    > cycles to output a 14 bit address (IIRC) and the third cycle was for
    > the 8 bits of data. Every instruction was built of these three
    > machine cycle memory ops (even if it was a register transfer).
    >
    > His machine seems to have emulated that and so uses up to 6 clock
    > cycles for a basic instruction. I don't know much about the ZPU, but
    > my CPU uses one clock cycle for any instruction other than program
    > memory reads which require three cycles.
    >
    > You like the variable length literal instructions ala the Transputer?
    > They are used to set up the immediate addresses for jumps and calls
    > too. Unfortunately this makes for some trouble with defining
    > addresses in the assembler. I never did get that to work correctly.
    > Every time a byte was added or subtracted from the opcodes, it would
    > move all of the other labels and you had to start over with the
    > calculations. I think you could have situations that never
    > converged.
    >
    > Otherwise I was pretty happy with my CPU. But I don't want to
    > continue using it if there are better CPUs available. But it will be
    > a couple of weeks before I can really spend any time on this.
    >
    > Rick


    Yes, I like the idea of prefix instructions -- I am a believer in
    compact instruction sets, even if it makes the CPU slightly more
    complex. The transputer linker had the same issues you allude to with
    yours -- the linker would sometimes have to make many tens, or even a
    few hundred, passes (for a large program) to make all of the variable
    length prefix instructions as short as possible. That is probably one
    of the reasons that the successor to the transputer from www.xmos.com
    looks much more like a modern register-based architecture with a lot of
    other clever transputer features retained or extended. Sauermann
    started with the 8080/Z80 only to come across the poor match to a C
    compiler. Since this was his starting point, I am not surprised that
    his final design shows some heritage from these designs. I would be
    very interested in knowing how your design fares with a C compiler (if
    someone smarter than me has the strength to do the port).
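
    For readers unfamiliar with prefix instructions, here is a rough
    sketch of the transputer-style scheme. Each instruction byte holds a
    4-bit function code and 4 bits of operand, and a "pfix" byte shifts
    its nibble into an operand register ahead of the real instruction.
    The function code numbers (pfix = 2, ldc = 4) are as I recall them
    from the transputer instruction set and should be checked against
    the manual. The helper names encode and decode_operand are made up
    for illustration, and negative operands (which need nfix) are left
    out.

```python
PFIX = 0x2  # prefix function code (transputer numbering, as recalled)
LDC = 0x4   # load-constant function code (same caveat)


def encode(function, operand):
    """Encode one instruction with a non-negative operand as a list of bytes."""
    if operand < 0:
        raise ValueError("negative operands need nfix, omitted from this sketch")
    # Split the operand into 4-bit nibbles, least significant first.
    nibbles = []
    while True:
        nibbles.append(operand & 0xF)
        operand >>= 4
        if operand == 0:
            break
    nibbles.reverse()  # most significant nibble first
    # Every nibble except the last travels in a pfix byte.
    prefixes = [(PFIX << 4) | n for n in nibbles[:-1]]
    return prefixes + [(function << 4) | nibbles[-1]]


def decode_operand(code):
    """Replay the operand-register updates a CPU would perform on these bytes."""
    oreg = 0
    for byte in code:
        oreg |= byte & 0xF
        if byte >> 4 == PFIX:
            oreg <<= 4
    return oreg


if __name__ == "__main__":
    for value in (0x5, 0x10, 0x123):
        code = encode(LDC, value)
        print(hex(value), [hex(b) for b in code], hex(decode_operand(code)))
```

    The instruction length depends on the operand value, so an address
    that crosses a 4-bit boundary costs one more byte. That is what
    forces the assembler or linker to make repeated passes.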
     
    Robert F. Jarnot, Jul 24, 2008
    #20