A GPU for ultra-low-power, cost-efficient, small-display applications

  • NEMA®|p is the smallest member of the NEMA®|GPU-Series and has been specifically designed for building cost-effective SoCs that drive small yet vibrant display applications.

    Developing such area-constrained applications can be challenging. A cost-efficient graphics solution is key to success, but it must not sacrifice graphics performance or ultra-low power consumption. NEMA®|p has a silicon footprint of just 0.084 mm² (250 MHz @ 28 nm), with GPU leakage power consumption of just 0.008 mW. Think Silicon's proprietary framebuffer compression technology (TSC™FB) limits memory power consumption to just 1.2 mW (in DDR-less systems).

    It features a programmable 2D graphics-rendering engine and innovative composition functionality, and supports Linux, RTOS, and graphics APIs such as Qt and DirectFB. A bare-metal library enables graphics application development in embedded systems with no operating system.

    Applications and Markets

    NEMA®|p is well suited to entry-level IoT platforms, wearables, and embedded devices with low-cost and ultra-low-power requirements. It supports SoCs with a 32-bit MCU (e.g. ARM® Cortex®-M processors) and provides a fluid 2D graphics experience for a wide range of applications.

    Developers are able to create compelling 2D Graphical User Interfaces (GUIs) and software applications with ultra-long battery life at a significantly lower cost for power-memory-area constrained IoT devices.

    For example, at core frequencies as low as 25 MHz, NEMA®|p delivers a fast 2D UI experience on a VGA (640x480) screen, and it is not limited to these parameters.

    Performance per mWatt per Dollar

    The NEMA®|GPU-Series performance per silicon area per clock frequency is unrivalled in its class. NEMA®|p has been designed to perform favorably against these critical performance benchmarks. As a result, NEMA®|p uses 87% less active power and 98% less idle power and has a 4 times smaller silicon footprint (compared to the competition), leading to significant cost reduction.

    Think Silicon's proprietary 4bpp (bits-per-pixel) real-time framebuffer compression (TSC™FB) and its 6bpp texture compression with real-time decompression (TSC™T) benefit architects and finance controllers equally. The compressed images and the software libraries are small enough to fit into the internal SoC memory. As a result, expensive external DDR memory can be minimized or eliminated entirely. This reduces SoC idle power consumption by an impressive 98%, extends system battery life about 10 times, and lowers the overall BOM cost. The combined performance and cost advantages make NEMA®|p a Performance-Power-Cost leader in the class of 2D GPUs.
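
To put the bits-per-pixel figures above in perspective, the following minimal sketch computes raw buffer sizes for a VGA (640x480) surface at several bit depths. The function name `buffer_bytes` is illustrative only, and the calculation assumes a flat bit budget per pixel, ignoring any alignment, padding, or metadata that a real compression scheme may add.

```python
def buffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Return the size in bytes of a width x height buffer at the given bit depth."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# VGA resolution, as used in the example earlier in the text.
WIDTH, HEIGHT = 640, 480

if __name__ == "__main__":
    depths = [
        ("32bpp uncompressed", 32),
        ("16bpp uncompressed", 16),
        ("6bpp texture", 6),
        ("4bpp framebuffer", 4),
    ]
    for label, bpp in depths:
        size = buffer_bytes(WIDTH, HEIGHT, bpp)
        print(f"{label:>20}: {size:>9} bytes ({size / 1024:.0f} KiB)")
```

Under these assumptions, a 4bpp VGA framebuffer needs 153,600 bytes (150 KiB) instead of 1,228,800 bytes (1,200 KiB) at 32bpp, which illustrates why such a buffer becomes a candidate for on-chip memory.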

    Architecture

    NEMA®|p has been designed for graphics efficiency in an ultra-compact silicon area and is the smallest member of the NEMA® family. Its fixed-point data path and instruction set architecture (ISA) are tailored to GUI acceleration and small display applications, leading to substantial improvements in power consumption and silicon area.

    The NEMA®|p microarchitecture is based on a lean version of the NEMA® ISA and combines hardware-level support for multi-threading, VLIW, and low-level vector processing in the most power-efficient way.


    Integration/verification

    The NEMA®|p GPU IP Platform is available in Verilog and is easy to integrate and verify. NEMA®|p ASIC reference designs have been evaluated in various process technologies. NEMA®|p is designed with AMBA interfaces (AHB, AXI 32 or 64 bits) and embeds DMA controllers and command lists for minimal CPU overhead. The core has been verified through extensive simulation and rigorous code coverage measurements. It comes with a complete verification suite that ensures correct migration to the target technology node.
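
The idea behind command lists can be illustrated with a small conceptual model. Everything in this sketch is hypothetical: `CommandList`, `FakeGpu`, `render_frame`, the command names, and the addresses are invented for illustration and are not the NEMA®|GFX-API or the hardware command format. It only shows the general pattern in which the CPU records several drawing commands and hands them over in a single submission.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CommandList:
    """Hypothetical command buffer: the CPU records commands, the GPU consumes them."""
    commands: List[Tuple[str, tuple]] = field(default_factory=list)

    def fill_rect(self, x: int, y: int, w: int, h: int, color: int) -> None:
        self.commands.append(("FILL_RECT", (x, y, w, h, color)))

    def blit(self, src_addr: int, dst_x: int, dst_y: int) -> None:
        self.commands.append(("BLIT", (src_addr, dst_x, dst_y)))


class FakeGpu:
    """Software stand-in for the hardware; counts how often the CPU gets involved."""

    def __init__(self) -> None:
        self.submissions = 0
        self.executed: List[Tuple[str, tuple]] = []

    def submit(self, cl: CommandList) -> None:
        self.submissions += 1              # one hand-off per list, not per command
        self.executed.extend(cl.commands)  # in hardware, a DMA engine would fetch these


def render_frame(gpu: FakeGpu) -> None:
    cl = CommandList()
    cl.fill_rect(0, 0, 640, 480, 0x000000)   # clear the background
    cl.blit(0x1000, 16, 16)                  # draw an icon from a made-up address
    cl.fill_rect(0, 440, 640, 40, 0x202020)  # draw a status bar
    gpu.submit(cl)                           # single submission for the whole frame
```

In this model the CPU touches the GPU once per frame regardless of how many commands the frame contains, which is the general reason command lists reduce CPU overhead.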

    Deliverables and Documentation*
    Deliverables include: a complete set of synthesis and STA (Static Timing Analysis) scripts, OS drivers (for Linux, RTOS), graphics API software libraries (for NEMA®|GFX-API, DirectFB, Qt) and standalone bare-metal drivers.

    Documentation includes: IP manual, integration manual, software library manual including code samples, and demonstration platform application notes.

    Reference design systems and demo sets are available for the following platforms: Xilinx Zynq, Altera SoCkit.

    * Listed items represent a superset and are subject to change without further notice.
    Listed items could be part of a unified product part number and may or may not be listed under a separate part number.
    Listed items are not part of an official quote unless listed in it.

    • Fully programmable engine with a VLIW instruction set
    • Fixed point functional units
    • Command list based DMAs to minimize CPU overhead
    • Compression schemes (Texture and Framebuffer)
      • Framebuffer TSC™4 (4 bits per pixel)
      • Texture TSC™6 / TSC™6a (6 bits per pixel without/with alpha)
    • 2D drawing
      • Pixel / Line drawing
      • Filled rectangles
      • Triangles (Gouraud Shaded)
      • Quadrilaterals
      • Antialiasing 8x MSAA
    • Blit support
      • Rotation, Mirroring
      • Stretch (independently on x and y axis)
      • Source and/or destination color keying
      • Format conversion
      • RGB and YUV
    • Image Transformation
      • 3D Perspective Correct Projections
      • Texture mapping
        • Point sampling
        • Bilinear filtering
    • Text rendering supports
      • A1 bitmap
      • A8 bitmap antialiased
      • Subsampled antialiased
    • Color formats
      • 32/16/8-bit RGB with/without alpha
      • GreyScale
      • YUV
      • RGBA
    • Full Alpha blending
      • Programmable blending modes
      • Source/Destination color keying
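
As an illustration of two of the listed features, alpha blending on fixed-point (integer) data and source color keying, here is a minimal software sketch. It is a generic textbook formulation of the source-over blend on 8-bit channels, not a description of how NEMA®|p implements it; the function names are illustrative only.

```python
from typing import Optional, Tuple

Rgb = Tuple[int, int, int]


def blend_channel(src: int, dst: int, alpha: int) -> int:
    """Source-over blend of one 8-bit channel using integer math only.

    alpha is 0..255; adding 127 before dividing rounds to the nearest value.
    """
    return (src * alpha + dst * (255 - alpha) + 127) // 255


def blend_pixel(src: Rgb, dst: Rgb, alpha: int,
                color_key: Optional[Rgb] = None) -> Rgb:
    """Blend src over dst; source pixels equal to color_key are skipped."""
    if color_key is not None and src == color_key:
        return dst  # keyed-out source pixel: destination shows through unchanged
    return tuple(blend_channel(s, d, alpha) for s, d in zip(src, dst))


if __name__ == "__main__":
    # Half-transparent red over blue.
    print(blend_pixel((255, 0, 0), (0, 0, 255), 128))
    # Magenta is used as the color key here, so the destination is kept.
    print(blend_pixel((255, 0, 255), (10, 20, 30), 255, color_key=(255, 0, 255)))
```

Keeping the whole computation in integers avoids floating-point hardware, which is consistent with the fixed-point data path mentioned above.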

    Configuration Options

    • Compression Schemes
      • TSC™4, TSC™6, TSC™6a
    • Cache Sizes
    • Master Interface
      • AMBA AHB 32-bit
      • AMBA AXI4 32/64-bit
    • Slave Interface
      • AMBA AHB
      • AMBA AXI4-Lite


  • NEMA®|p supports RTOS and Linux operating systems and ships with software libraries for 2D graphics APIs. DirectFB and RTOS support make it an ideal solution for software development with application and Graphical User Interface (GUI) creation frameworks such as Qt. A bare-metal library of primitive graphics functions enables graphics development for embedded applications.

    • OS support
      • RTOS (NEMA®|GFX Library in C)
      • BareMetal (no OS)
      • Linux
    • Graphics API support
      • NEMA®|GFX-API library in pure C
      • DirectFB
      • Qt
    • Software Development Kit
      • NEMA®|PIX-Presso
      • NEMA®|GUI-Builder*
      • NEMA®|SHADER-Edit
      • NEMA®|Bits

    * NEMA|GUI-Builder (non-commercial version)

  • NEMA®|p delivers stunning performance per silicon area per clock frequency.

    pico 2D GPU                        NEMA®|p-100
    GPU cores                          1
    Silicon area (mm² @ 28nm)          0.084
    Core clock (MHz @ 28nm)            250
    Shader (GOPS)                      1.8
    Pixel rate @ 200MHz (Mpixel/sec)   250
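
The pixel rate above can be turned into a rough upper bound on full-screen passes per second for a VGA (640x480) display. This is a back-of-the-envelope sketch only: the function names are illustrative, the figure is treated as a peak rate, and the scaling to other clock frequencies assumes the pixel rate is strictly proportional to the clock, which the table does not state. Real throughput also depends on memory bandwidth, blending, and overdraw.

```python
def scale_pixel_rate(rate_mpix: float, rated_clock_mhz: float,
                     target_clock_mhz: float) -> float:
    """Scale a pixel rate to another clock, ASSUMING linear scaling with frequency."""
    return rate_mpix * target_clock_mhz / rated_clock_mhz


def full_screen_passes_per_second(rate_mpix: float, width: int, height: int) -> float:
    """Upper bound on how many times per second every pixel can be written once."""
    return rate_mpix * 1_000_000 / (width * height)


if __name__ == "__main__":
    rated = 250.0  # Mpixel/sec at 200 MHz, from the table
    print(f"{full_screen_passes_per_second(rated, 640, 480):.1f} passes/s at 200 MHz")

    low = scale_pixel_rate(rated, 200.0, 25.0)  # assumed linear scaling to 25 MHz
    print(f"{low:.2f} Mpixel/s, "
          f"{full_screen_passes_per_second(low, 640, 480):.1f} passes/s at 25 MHz")
```

Under the linear-scaling assumption, 25 MHz corresponds to 31.25 Mpixel/sec, or roughly 100 single-pass VGA frames per second as a theoretical ceiling.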

    NEMA®|p running DirectFB demos

For additional information, download the NEMA®|p Product Brief.