
Profile of an IoT processor for the industrial and consumer markets




The intersection of data with intelligent machines is creating new possibilities in industrial automation, and this new frontier is increasingly known as the Industrial Internet of Things (IIoT). However, if there is a single major stumbling block hindering IoT take-off at the larger industrial scale, it’s security.

It’s imperative to have reliable data in the industrial automation environment, and here, the additional security layers in the IoT hardware often lead to compromises in performance. Then, there is counterfeiting of products and application software, which is becoming a growing concern in the rapidly expanding IoT market.


Atmel’s answer to security concerns in the IIoT infrastructure is a microprocessor (MPU) that delivers security while maintaining the level of performance that Internet-connected systems require. The company’s Cortex-A5 chip, the Atmel | SMART SAMA5D4, securely stores and transfers data and safeguards software assets to prevent cloning of IoT applications.

The SAMA5D4 series of MPUs enables on-the-fly encryption and decryption of software code from the external DRAM. Moreover, it offers security features such as secure boot, tamper detection pins and safe erasure of security-critical data. The SAMA5D4 processor also incorporates ARM’s system-wide security approach, TrustZone, which is used to secure peripherals such as memory and crypto blocks. TrustZone, a set of security extensions that can be implemented in a number of ARM cores, is tightly integrated into ARM’s Cortex-A processors. It runs the processor in two different modes: a secure environment executes critical security and safety software, while a normal environment runs the rich OS and its applications, such as Linux. This lets embedded designers isolate critical software from OS software.

The system-wide approach controls access to the CPU, memories, DMA and peripherals through programmable secure regions. That, in turn, ensures that on-chip parts like the CPU and off-chip parts like peripherals are protected from software attacks.
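The idea of programmable secure regions can be illustrated with a small model. This is only a sketch: the region names, addresses and the `access_allowed` helper are hypothetical and do not reflect the SAMA5D4 register layout or ARM’s actual TrustZone implementation. It simply shows how tagging address ranges as secure lets the system reject accesses that come from the normal world.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int    # first byte address of the region
    size: int     # length of the region in bytes
    secure: bool  # True: only the secure world may access it

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size


# Hypothetical memory map; names and addresses are made up for illustration.
REGIONS = [
    Region("key_store", 0x0010_0000, 0x1000, secure=True),
    Region("crypto_engine", 0x0020_0000, 0x1000, secure=True),
    Region("app_dram", 0x2000_0000, 0x0100_0000, secure=False),
]


def access_allowed(addr: int, secure_world: bool) -> bool:
    """Return True if an access to addr is permitted from the given world."""
    for region in REGIONS:
        if region.contains(addr):
            # Secure regions reject normal-world accesses;
            # non-secure regions accept accesses from both worlds.
            return secure_world or not region.secure
    return False  # unmapped addresses are rejected in this model


print(access_allowed(0x0010_0010, secure_world=False))  # False
print(access_allowed(0x0010_0010, secure_world=True))   # True
print(access_allowed(0x2000_0000, secure_world=False))  # True
```

In this model a rich OS running in the normal world can use application DRAM freely but cannot read the key store, which mirrors the isolation described above.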


Performance Uplift

The Atmel | SMART SAMA5D4 processor is based on the Cortex-A5, the smallest and simplest of the Cortex-A series cores that support the 32-bit ARMv7 instruction set. It’s targeted at applications requiring high-precision computing and fast signal processing, including industrial and consumer applications such as control panels, communication gateways and imaging terminals.

The use cases for SAMA5D4 span from kiosks, vending machines and barcode scanners, to smart grid, communications gateways and control panels for security, home automation, thermostats, etc. Atmel’s MPU features peripherals for connectivity and user interface applications. For instance, it offers a TFT LCD controller for human-machine interface (HMI) and control panel applications and a dual Ethernet MAC for networking and gateway solutions.

Apart from providing high-grade security, the SAMA5D4 adds two other crucial features to address the limitations of its predecessor, the SAMA5D3 processor. First, it lifts performance through ARM’s NEON DSP engine and a 128kB L2 cache. The NEON DSP, with its 128-bit single instruction, multiple data (SIMD) architecture, accelerates signal processing for more effective handling of multimedia and graphics. Likewise, the L2 cache enhances data processing capability for imaging applications.

The second prominent feature of the SAMA5D4 is video playback, provided by a 720p-resolution hardware video decoder with post-image processing capability. Atmel’s embedded processor offers video playback for the H.264, VP8 and MPEG4 formats at 30fps.

A Quick Overview of the SAMA5D4

The SAMA5D4 processor, which gained a 14 percent performance boost over its predecessor MPU by raising the operating speed to 528 MHz, is a testament to the changing microprocessor market in the IoT arena. Atmel’s microprocessor for IoT markets delivers 840 DMIPS, enough to serve imaging-centric applications hungry for processing power. In addition, the SAMA5D4 is equipped with a 32-bit-wide DDR controller running at up to 176 MHz, which can deliver up to 1408 MB/s of bandwidth. That’s a critical element for the high-speed peripherals common in industrial environments, where microprocessors are required to process large amounts of data.
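The bandwidth figure follows from simple arithmetic, sketched below. The assumption here is that the quoted number is the theoretical peak of a double-data-rate interface (two transfers per clock cycle); the function name is made up for this example, and sustained throughput in a real system would be lower.

```python
def ddr_peak_bandwidth_mb_s(clock_mhz: float, bus_bits: int) -> float:
    """Theoretical peak bandwidth of a DDR interface in MB/s."""
    transfers_per_clock = 2            # DDR moves data on both clock edges
    bytes_per_transfer = bus_bits / 8  # bus width converted to bytes
    return clock_mhz * transfers_per_clock * bytes_per_transfer


print(ddr_peak_bandwidth_mb_s(176, 32))  # 1408.0
print(ddr_peak_bandwidth_mb_s(176, 16))  # 704.0

# Compute efficiency implied by the quoted figures (DMIPS per MHz).
print(round(840 / 528, 2))  # 1.59
```

The second call shows what the same controller clock would yield on a 16-bit bus: half the peak bandwidth.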


Finally, the SAMA5D4 is available with either a 16-bit or a 32-bit bus interface, allowing developers to trade off performance against memory cost. There are four distinct chips in the SAMA5D4 family: SAMA5D41 (16-bit DDR), SAMA5D42 (32-bit DDR), SAMA5D43 (16-bit DDR along with H.264 video decoder) and SAMA5D44 (32-bit DDR along with H.264 video decoder).
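The four-part lineup amounts to a small lookup table. The sketch below encodes only the two attributes listed above; the `pick_variants` helper and the dictionary layout are hypothetical conveniences, not an Atmel selection tool.

```python
# Family members and the two attributes that distinguish them in the text.
VARIANTS = {
    "SAMA5D41": {"ddr_bits": 16, "video_decoder": False},
    "SAMA5D42": {"ddr_bits": 32, "video_decoder": False},
    "SAMA5D43": {"ddr_bits": 16, "video_decoder": True},
    "SAMA5D44": {"ddr_bits": 32, "video_decoder": True},
}


def pick_variants(min_ddr_bits: int, need_video_decoder: bool) -> list:
    """Return the parts that meet a minimum bus width and decoder requirement."""
    return sorted(
        name
        for name, spec in VARIANTS.items()
        if spec["ddr_bits"] >= min_ddr_bits
        and (spec["video_decoder"] or not need_video_decoder)
    )


print(pick_variants(32, need_video_decoder=True))   # ['SAMA5D44']
print(pick_variants(16, need_video_decoder=True))   # ['SAMA5D43', 'SAMA5D44']
```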

The SoC-specific hardware security and embedded vision capabilities are a stark reminder of the specific requirements of different facets of IoT, in this case the industrial and consumer markets. And Atmel’s specific focus on security and rich media shows how the semiconductor industry is getting around the key IoT stumbling blocks.


Majeed Ahmad is the author of the books Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics and The Next Web of 50 Billion Devices: Mobile Internet’s Past, Present and Future.

Secured SAMA5D4 for industrial, fitness or IoT display




The new SAMA5D4 expands the Atmel | SMART Cortex-A5-based family, adding a 720p-resolution hardware video decoder to target Human Machine Interface (HMI), control panel and IoT applications where high-performance display capability is required. The Cortex-A5 offers raw performance of 945 DMIPS (@ 600 MHz), complemented by the ARM NEON 128-bit SIMD (single instruction, multiple data) DSP architecture extension. To target applications like home automation, surveillance cameras, control panels for security, or industrial and residential gateways, high DMIPS computing is not enough. To really make a difference, on top of the dedicated hardware video decoder (H.264, VP8, MPEG4), you need the most complete set of security features.


Whether for home automation or industrial HMI, you want your system to be safeguarded from hackers and your investment protected against counterfeiting. You have the option to select a 16-bit DDR2 interface, or a 32-bit one if you need better performance, but security is no longer just an option. Designing with the Atmel | SMART SAMA5D4 gives you secure boot, ARM TrustZone, an encrypted DDR bus, tamper detection pins and secure data storage. This MPU also integrates hardware encryption engines supporting AES (Advanced Encryption Standard)/3DES (Triple Data Encryption Standard), RSA (Rivest-Shamir-Adleman) and ECC (Elliptic Curve Cryptography), as well as SHA (Secure Hash Algorithm) and a TRNG (True Random Number Generator).
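To give a feel for what a hash engine contributes to secure boot, here is a minimal sketch of an integrity check using SHA-256 from Python’s standard library. It is not the SAMA5D4 boot ROM flow: a real secure boot chain would verify a signature against a key anchored in hardware, and the image bytes and helper names below are invented for illustration.

```python
import hashlib
import hmac


def image_digest(image: bytes) -> bytes:
    """Compute the SHA-256 digest of a firmware image."""
    return hashlib.sha256(image).digest()


def verify_image(image: bytes, expected_digest: bytes) -> bool:
    """Accept the image only if its digest matches the trusted reference."""
    # compare_digest runs in constant time to avoid leaking match position.
    return hmac.compare_digest(image_digest(image), expected_digest)


firmware = b"example application image"  # placeholder image contents
trusted = image_digest(firmware)         # reference stored in a trusted place

print(verify_image(firmware, trusted))            # True
print(verify_image(firmware + b"\x00", trusted))  # False
```

Even a one-byte modification changes the digest, so a cloned or tampered image fails the check before it is allowed to run.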

If you design fitness equipment, such as treadmills and exercise machines, you may be more sensitive to connectivity and user interface functions than to security elements, even if it’s important to feel safe with respect to counterfeiting. Connectivity includes gigabit and 10/100 Ethernet, up to two High-Speed USB ports (configurable as two hosts or one host and one device port), one High-Speed Inter-Chip (HSIC) port, several SDIO/SD/MMC interfaces, dual CAN, etc. Because the SAMA5D4 is intended to support industrial, consumer or IoT applications requiring efficient display capabilities, it integrates LCD controllers with a graphics accelerator, a resistive touchscreen controller, a camera interface and the aforementioned 720p 30fps video decoder.


The MCU market is highly competitive, especially when you consider that most of the products are developed around the same ARM-based family of cores (from the Cortex-M to the Cortex-A5 series). Performance is an important differentiation factor, and the SAMA5D4 is the highest-performing MPU in the Atmel ARM Cortex-A5-based MPU family, offering up to 945 DMIPS (@ 600 MHz), complemented by the ARM NEON 128-bit SIMD (single instruction, multiple data) DSP extension. Using safety and security on top of performance to augment differentiation is certainly an efficient architecture choice. As you can see in the block diagram below, the part features the ARM TrustZone system-wide approach to security, complemented by advanced security features that protect the application software from counterfeiting, like an encrypted DDR bus, tamper detection pins and secure data storage. But that’s not enough. Fortunately, this microprocessor also integrates hardware encryption engines supporting AES/3DES, RSA and ECC, as well as SHA and a TRNG.

The SAMA5 series targets industrial or fitness applications where safety is a key differentiating factor. While security helps protect the software asset and makes the system robust against hacking, safety directly protects the user. The user can be the woman on the treadmill, or the various machines connected to the display that the SAMA5 MCU controls. This series is equipped with functions that ease the implementation of safety standards like IEC 61508, including a main crystal oscillator clock with failure detector, POR (power-on reset), independent watchdog timers, a write protection register, etc.

The SAMA5D4 is a medium-to-heavy processor well suited for IoT, control panels, HMI and the like, differentiated from other Atmel MCUs by means of performance and security (not to mention safety). The ARM Cortex-A5-based device delivers up to 945 DMIPS when running at 600 MHz, complemented by the ARM NEON 128-bit SIMD DSP architecture extension. The most important factor that sets the SAMA5D4 apart from the rest is probably its security capabilities. These protect OEM software investments from counterfeiting and user privacy against hacking, and together with its safety features they make the SAMA5D4 ideal for industrial, fitness or IoT applications.


This post has been republished with permission from SemiWiki.com, where Eric Esteve is a principal blogger as well as one of the four founding members of the site. This blog first appeared on SemiWiki on October 6, 2015.

6 memory considerations for Cortex-M7-based IoT designs


Taking a closer look at the configurable memory aspects of Cortex-M7 microcontrollers.


Tightly coupled memory (TCM) is a salient feature in the Cortex-M7 lineup as it boosts the MCU’s performance by offering single cycle access for the CPU and by securing the high-priority latency-critical requests from the peripherals.


The early MCU implementations based on ARM’s Cortex-M7 embedded processor core, like Atmel’s SAM E70 and S70 chips, have arrived in the market. So it’s worthwhile to take a closer look at the configurable memory aspects of M7 microcontrollers and see how the TCMs enable the execution of deterministic code and the fast transfer of real-time data at full processor speed.

Here are some of the key findings regarding the advanced memory architecture of Cortex-M7 microcontrollers:

1. TCM is Configurable

First and foremost, the size of TCM is configurable. TCM, which is part of the physical memory map of the MCU, supports up to 16MB of tightly coupled memory. The configurability of the ARM Cortex-M7 core also allows SoC architects to integrate a range of cache sizes, so that industrial and Internet of Things product developers can determine the amount of critical code and real-time data to place in TCM to meet the needs of the target application.

The Cortex-M7 architecture doesn’t specify what type of memory or how much memory should be provided; instead, it leaves these decisions to designers implementing the M7 in a microcontroller as an avenue for differentiation. Consequently, a flexible memory system can be optimized for performance, determinism and low latency, and thus can be tuned to specific application requirements.

2. Instruction TCM

Instruction TCM or ITCM implements critical code with deterministic execution for real-time processing applications such as audio encoding/decoding, audio processing and motor control. The use of standard memory will lead to delays due to cache misses and interrupts, and therefore will hamper the deterministic timing required for real-time response and seamless audio and video performance.

The deterministic critical software routines should be loaded through the 64-bit instruction memory port (ITCM), which supports the dual-issue processor architecture and provides single-cycle access for the CPU to boost MCU performance. However, developers need to carefully gauge the amount of code that needs zero-wait execution performance to determine the amount of ITCM required in an MCU device.


The anatomy of TCM inside the M7 architecture.

3. Data TCM

Data TCM or DTCM is used in fast data processing tasks like 2D barcode decoding and fingerprint and voice recognition. There are two data ports (DTCMs) that provide simultaneous and parallel 32-bit accesses to real-time data. Both instruction TCM and data TCM, which are used for efficient access to on-chip Flash and external resources, must have the same size.

4. System RAM and TCM

System RAM, also known as general RAM, is employed for communications stacks related to networking, fieldbus, high-bandwidth bridging, USB, etc. It implements peripheral data buffers, generally through direct memory access (DMA) engines, and can be accessed by masters without CPU intervention.

Here, product developers must keep in mind the memory access conflicts that arise from concurrent data transfers by both the CPU and DMA. Developers must therefore set clear priorities for latency-critical requests from the peripherals and carefully plan latency-critical data transfers, like the transfer of a USB descriptor or data from a slow peripheral with a small local buffer. Accesses from the DMA and the caches are generally bursts to consecutive addresses to optimize system performance.

It’s worth noting that while system memory is logically separate from the TCM, microcontroller suppliers like Atmel are incorporating TCM and system RAM in a single SRAM block. That lets IoT developers share general-purpose tasks while splitting TCM and system RAM functions for specific use cases.


A single SRAM block for TCM and system memory allows higher flexibility and utilization.

5. TCM Loading

The Cortex-M7 uses a scattered RAM architecture to allow the MCU to maximize performance by having a dedicated RAM part for critical tasks and data transfer. The TCM might be loaded from a number of sources, and these sources aren’t specified in the M7 architecture. It’s left to the MCU designers whether there is a single DMA or several data loading points from various streams like USB and video.

It’s imperative that, during the software build, IoT product developers identify which code segments and data blocks are allocated to the TCM. This is done by embedding pragmas into the software and by applying linker settings so that the software build places the code and data at the appropriate memory locations.

6. Why SRAM?

Flash memory can be attached to a TCM interface, but the Flash cannot run at the processor clock speed and will require caching. As a result, this will cause delays when cache misses occur, threatening the deterministic value proposition of the TCM technology.

DRAM technology is a theoretical choice but it’s cost prohibitive. That leaves SRAM as a viable candidate for fast, direct and uncached TCM access. SRAM can be easily embedded on a chip and permits random accesses at the speed of the processor. However, cost-per-bit of SRAM is higher than Flash and DRAM, which means it’s critical to keep the size of the TCM limited.

Atmel | SMART Cortex-M7 MCUs

Take the case of Atmel’s SMART SAM E70, S70 and V70/71 microcontrollers that organize SRAM into four memory banks for TCM and System SRAM parts. The company has recently started shipping volume units of its SAM E70 and S70 families for the IoT and industrial markets, and claims that these MCUs provide 50 percent better performance than the closest competitor.


Atmel’s M7-based microcontrollers offer up to 384KB of embedded SRAM that is configurable as TCM or system memory for providing IoT designs with higher flexibility and utilization. For instance, E70 and S70 microcontrollers organize 384KB of embedded SRAM into four ports to limit memory access conflicts. These MCUs allocate 256KB of SRAM for TCM functions — 128 KB for ITCM and DTCM each — to deliver zero wait access at 300MHz processor speed, while the remaining 128KB of SRAM can be configured as system memory running at 150MHz.
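The split described above is easy to check with a little arithmetic. The sketch below assumes only what the article states: 384KB of total SRAM and equal-sized ITCM and DTCM. The `split_sram` helper is hypothetical, and actual devices may restrict TCM to specific configurable sizes that this model does not capture.

```python
TOTAL_SRAM_KB = 384  # embedded SRAM quoted for the E70 and S70 parts


def split_sram(itcm_kb: int, dtcm_kb: int, total_kb: int = TOTAL_SRAM_KB) -> dict:
    """Partition on-chip SRAM into ITCM, DTCM and remaining system RAM."""
    if itcm_kb < 0 or dtcm_kb < 0:
        raise ValueError("TCM sizes must be non-negative")
    if itcm_kb != dtcm_kb:
        # The article states that ITCM and DTCM must have the same size.
        raise ValueError("ITCM and DTCM must be the same size")
    tcm_kb = itcm_kb + dtcm_kb
    if tcm_kb > total_kb:
        raise ValueError("TCM allocation exceeds available SRAM")
    return {"itcm_kb": itcm_kb, "dtcm_kb": dtcm_kb, "system_kb": total_kb - tcm_kb}


print(split_sram(128, 128))
# {'itcm_kb': 128, 'dtcm_kb': 128, 'system_kb': 128}
print(split_sram(0, 0))
# {'itcm_kb': 0, 'dtcm_kb': 0, 'system_kb': 384}
```

With no TCM configured, the whole 384KB remains available as system memory, which is the flexibility the single SRAM block is meant to provide.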

Moreover, the availability of an SRAM block organized as a 384KB memory bank means that both system SRAM and TCM can be used at the same time. The large on-chip SRAM of 384KB is also critical for many IoT devices, since it enables them to run multiple communication stacks and applications on the same MCU without adding external memory. That’s a significant value proposition in the IoT realm because avoiding external memories lowers the BOM cost, reduces the PCB footprint and eliminates the complexity of high-speed PCB design.