Tag Archives: DSP performance

ARM Keil ecosystem integrates the Atmel SAM ESV7


Keil is part of the wider ARM ecosystem, enabling developers to speed up system release to the market.


Even the best System-on-Chip (SoC) is useless without software, just as the best-designed software needs hardware to flourish. The “old” embedded world has exploded into many emergent markets like the IoT, wearables, and even automotive, which is no longer restricted to motor control or airbags now that innovative products from entertainment to ADAS are being developed. What is the common denominator of these emergent products? Each of them requires more software functionality and fast memory algorithms with deterministic code execution, and consequently innovative hardware to support these requirements, such as the ARM Cortex-M7-based Atmel | SMART SAM ESV7.

AtmelChipLib Overview

ARM has released a complete software development environment for a range of ARM Cortex-M based MCU devices: Keil MDK. Keil is part of the wider ARM ecosystem, enabling developers to speed up system release to the market. MDK includes the µVision IDE/Debugger and ARM C/C++ Compiler, along with the essential middleware components and software packs. If you’re familiar with the stacked Run-Time Environment diagram, you’ll recognize the various stacks. Let’s focus on “CMSIS-Driver”. CMSIS is the standard software framework for Cortex-M MCUs, extending the SAM-ESV7 Chip Library with standardized drivers for middleware and generic component interfaces.

By definition, an MCU is designed to address multiple applications, and the SAM ESV7 is dedicated to supporting performance-demanding and DSP-intensive systems. Thanks to its 300MHz clock, the SAM ESV7 delivers up to 640 DMIPS, and its DSP performance is double that available in the Cortex-M4. A double-precision floating-point unit and a double-issue instruction pipeline further position the Cortex-M7 for speed.
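To put the quoted throughput in perspective, it can help to normalize it by clock frequency. The short Python sketch below only divides the two figures given above (640 DMIPS at 300 MHz); the helper name `per_mhz` is made up for this illustration.

```python
def per_mhz(score: float, clock_mhz: float) -> float:
    """Normalize a benchmark score by the core clock (score per MHz)."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return score / clock_mhz


# Figures quoted in the article for the SAM ESV7.
CLOCK_MHZ = 300.0
DMIPS = 640.0

print(f"{per_mhz(DMIPS, CLOCK_MHZ):.2f} DMIPS/MHz")
```

The same helper can be applied to any other score that is quoted together with a clock frequency.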

Atmel Cortex M7 based Dev board

Let’s review some of these applications where SAM ESV7 is the best choice…

Fingerprint Module

The goal is to provide a biometric authentication module for office or home access control. The key design requirements are:

  • +300 MHz CPU performance to process recognition algorithms
  • Image sensor interface to read raw finger image data from finger sensor array
  • Low cost and smaller module size
  • Embedded Flash/memory to reduce BOM cost and module size
  • Memory interface to expand the module with external memory if needed.

Superior performance and an image sensor interface can be seen as essential needs, but what will make the difference is offering both a lower BOM cost and a smaller module size than the competition. The SAM S70 integrates up to 2MB of embedded Flash, twice as much as the direct competitor, which may allow reducing BOM cost and module size.

SAM S70 Finger Print

Automotive Radio System

Every cent counts in automotive design, and OEMs prefer using an MCU rather than an MPU, primarily for cost reasons. Building an attractive radio for tomorrow’s car requires developing high-performance DSP algorithms. Such algorithms used to be developed on an expensive standard DSP part, leading to a large module size, with external Flash and an MCU, and obviously to a heavy BOM. In a 65nm embedded Flash process device, the Cortex-M7 can achieve a 1500 CoreMark score while running at 300 MHz, and its DSP performance is double that available in the Cortex-M4. This DSP power can be used to manage eight channels of speaker processing, including six stages of biquads, delay, scaler, limiter and mute functions. The SAM S71 workload is only 63% of the CPU, leaving enough room to support an Ethernet AVB stack, which is very popular in automotive.
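As a rough illustration of what one such speaker channel computes, here is a minimal Python sketch of a cascade of biquad sections followed by a scaler, limiter and mute. It is not Atmel's implementation: the transposed Direct Form II structure, the class and function names, and the coefficient values are all assumptions chosen for readability, and the delay stage is left out.

```python
from dataclasses import dataclass


@dataclass
class Biquad:
    """One second-order IIR section (transposed Direct Form II).

    Coefficients are assumed to be normalized so that a0 == 1.
    """
    b0: float
    b1: float
    b2: float
    a1: float
    a2: float
    z1: float = 0.0  # first state variable
    z2: float = 0.0  # second state variable

    def step(self, x: float) -> float:
        y = self.b0 * x + self.z1
        self.z1 = self.b1 * x - self.a1 * y + self.z2
        self.z2 = self.b2 * x - self.a2 * y
        return y


def process_channel(samples, stages, gain=1.0, limit=1.0, mute=False):
    """Run samples through the biquad cascade, then scaler, limiter and mute."""
    out = []
    for x in samples:
        for stage in stages:                 # six stages in the article's example
            x = stage.step(x)
        x *= gain                            # scaler
        x = max(-limit, min(limit, x))       # hard limiter (an assumption)
        out.append(0.0 if mute else x)       # mute
    return out


# Hypothetical coefficients: one two-tap averaging section plus five pass-through sections.
stages = [Biquad(0.5, 0.5, 0.0, 0.0, 0.0)] + [Biquad(1.0, 0.0, 0.0, 0.0, 0.0) for _ in range(5)]
print(process_channel([1.0, 0.0, 0.0], stages))
```

With the 63% load quoted above, eight such channels would leave roughly 37% of the CPU for other work such as the network stack.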

One of the secret sauces of the Cortex-M7 architecture is to provide a way to bypass the standard execution mechanism using “tightly coupled memories,” or TCM. There is an excellent white paper describing TCM implementation in the SAM S70/E70 series, entitled “Run Blazingly Fast Algorithms with Cortex-M7 Tightly Coupled Memories” from Lionel Perdigon and Jacko Wilbrink, which you can find here.


This post has been republished with permission from SemiWiki.com, where Eric Esteve is a principal blogger as well as one of the four founding members of the site. This blog first appeared on SemiWiki on October 23, 2015.

How to prevent execution surprises for Cortex-M7 MCU


We know the heavy weight of software development: 60% to 70% of the overall project cost.


The ARM Cortex-A series processor cores (A57, A53) are well known in high-performance market segments, like application processing for smartphones, set-top boxes and networking. If you look at the electronics market, you realize that many applications are cost sensitive and don’t need such a high-performance processor core. We may call it the embedded market, even if this definition is vague. The ARM Cortex-M family has been developed to address these numerous market segments, starting with the Cortex-M0 for lowest cost, the Cortex-M3 for the best power/performance balance, and the Cortex-M4 for applications requiring digital signal processing (DSP) capabilities.

For the audio, voice control, object recognition, and complex sensor fusion of automotive and higher-end Internet of Things sensing, where complex algorithms for audio and video are needed for rich audio and visual capabilities, Cortex-M7 is required. ARM offers the processor core as well as the Tightly Coupled Memory (TCM) architecture, but ARM licensees like Atmel have to implement memories in such a way that the user can take full benefit from the M7 core to meet system performance and latency goals.

Figure 1. The TCM interface provides a single 64-bit instruction port and two 32-bit data ports.


In a 65nm embedded Flash process device, the Cortex-M7 can achieve a 1500 CoreMark score while running at 300 MHz, offering top-class DSP performance thanks to a double-precision floating-point unit and a double-issue instruction pipeline. But algorithms like FIR, FFT or biquad filters need to run as deterministically as possible for real-time response or seamless audio and video performance. How do you best select and implement the memories needed to support such performance? If you choose Flash, it will require caching (as Flash is too slow), which brings the risk of cache misses. SRAM technology is a better choice, since it can be easily embedded on-chip and permits random access at the speed of the processor.
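The determinism argument can be made concrete with a toy timing model. The Python sketch below is only an illustration: the cycle counts and the miss penalty are hypothetical placeholders, not measured Cortex-M7 or SAM figures, and the worst case simply assumes that every access misses.

```python
def execution_window(accesses: int, hit_cycles: int, miss_penalty: int) -> tuple[int, int]:
    """Return (best_case, worst_case) cycle counts for a run of memory accesses.

    Best case: every access hits. Worst case: every access misses.
    """
    if accesses < 0 or hit_cycles < 0 or miss_penalty < 0:
        raise ValueError("arguments must be non-negative")
    best = accesses * hit_cycles
    worst = accesses * (hit_cycles + miss_penalty)
    return best, worst


def jitter(accesses: int, hit_cycles: int, miss_penalty: int) -> int:
    """Spread between worst-case and best-case execution time, in cycles."""
    best, worst = execution_window(accesses, hit_cycles, miss_penalty)
    return worst - best


ACCESSES = 1000  # hypothetical inner-loop access count

# Hypothetical cached Flash: 1-cycle hit, 10-cycle miss penalty.
print("cached Flash jitter:", jitter(ACCESSES, 1, 10))
# Memory with no miss path at all (the zero-penalty case).
print("zero-penalty jitter:", jitter(ACCESSES, 1, 0))
```

In this model the average speed of the two options can be identical; what differs is the spread between best and worst case, which is the quantity a real-time filter cares about.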

Peripheral data buffers implemented in general-purpose system SRAM are typically loaded by DMA transfers from system peripherals. The ability to load from a number of possible sources, however, raises the possibility of unnecessary delays and conflicts caused by multiple DMAs trying to access the memory at the same time. In a typical example, we might have three different entities vying for access to the SRAM: the processor (64-bit access, requesting 128 bits for this example) and two separate peripheral DMA requests (DMA0 and DMA1, 32-bit access each). Atmel has gotten around this issue by organizing the SRAM into several banks as described in this picture:

Figure 2. By organizing the SRAM into banks, multiple DMA bursts can occur simultaneously with minimal latency.
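The benefit of banking can be sketched with a very simple contention model. The Python snippet below is a simplification under stated assumptions, not a description of Atmel's arbiter: each bank is assumed to serve one beat per cycle, requests to the same bank are assumed to serialize, and each DMA request is assumed to be a single 32-bit beat. Only the processor's two beats (128 bits over a 64-bit port) come from the example in the text.

```python
from collections import defaultdict


def cycles_to_drain(requests):
    """Cycles needed to serve all requests.

    requests: iterable of (name, bank, beats). Banks work in parallel;
    beats aimed at the same bank are served one per cycle.
    """
    load = defaultdict(int)
    for _name, bank, beats in requests:
        if beats < 0:
            raise ValueError("beats must be non-negative")
        load[bank] += beats
    return max(load.values(), default=0)


# Everything lands in one bank: the three requesters wait for each other.
single_bank = [("CPU", 0, 2), ("DMA0", 0, 1), ("DMA1", 0, 1)]
# Each requester lands in its own bank: the transfers overlap.
banked = [("CPU", 0, 2), ("DMA0", 1, 1), ("DMA1", 2, 1)]

print("single bank:", cycles_to_drain(single_bank), "cycles")
print("banked:     ", cycles_to_drain(banked), "cycles")
```

In the banked case the total time is set by the busiest bank alone, which is why spreading buffers across banks keeps the DMA traffic from delaying the processor.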


For a chip maker designing microcontrollers, licensing an ARM Cortex-M processor core provides numerous advantages. The very first is the ubiquity of the ARM core architecture, which has been adopted in multiple market segments to support a variety of applications. If this chip maker wants to design in a new customer, the probability that this OEM has already used an ARM-based MCU is very high, and it’s very important for the OEM to be able to reuse existing code (we know the heavy weight of software development, at 60% to 70% of the overall project cost). But this ubiquity generates a challenge: how do you differentiate from the competition when competitors can license exactly the same processor core?

Selecting a more aggressive technology node and providing better performance at lower cost is one option, but this advantage can disappear as soon as the competition also moves to this node. Integrating a larger amount of Flash is another option, which is very efficient if the product is designed on a technology that keeps the pricing low enough.

If the chip maker has designed on an aggressive technology node for higher performance and offers a larger amount of Flash than the competition, that may be enough differentiation. Complementing this with a smarter memory architecture, unencumbered by cache misses, interrupts, context swaps, and other execution surprises that work against deterministic timing, brings strong differentiation.


If you want to more completely understand how Atmel has designed this SMART memory architecture for the Cortex-M7, I encourage you to read this white paper from Jacko Wilbrink and Lionel Perdigon entitled “Run Blazingly Fast Algorithms with Cortex-M7 Tightly Coupled Memories.” (You will have to register.) This paper describes MCUs integrating SRAM organized into four banks that can be used as general SRAM and for TCM, showing one example of a Cortex-M7 MCU being implemented in the Atmel | SMART SAM S70, SAM E70 and SAM V70/V71 families.


This post has been republished with permission from SemiWiki.com, where Eric Esteve is a principal blogger, as well as one of the four founding members of the site. This blog was originally shared on August 6, 2015.

3 design hooks of Atmel MCUs for connected cars


The MPU and MCU worlds are constantly converging and colliding, and the difference between them is not a mere on-off switch — it’s more of a sliding bar. 


In February 2015, BMW reported that it had patched a security flaw that could have allowed hackers to remotely unlock the doors of more than 2 million BMW, Mini and Rolls-Royce vehicles. Earlier, researchers at ADAC, a German motorist association, had demonstrated how they could intercept communications with BMW’s ConnectedDrive telematics service and unlock the doors.


BMW uses a SIM card installed in the car to connect to a smartphone app over the Internet. Here, by reverse engineering BMW’s telematics software, the ADAC researchers created a fake mobile network and tricked nearby cars into taking commands.

The BMW hacking episode was a rude awakening for the connected car movement. The fact that prominent features like advanced driver assistance systems (ADAS) are all about safety and security is also a testament that secure connectivity will be a prime consideration for the Internet of Cars.

Built-in Security

Atmel is confident that it can establish secure connections for vehicles by merging its security expertise with the performance and low-power gains of ARM Cortex-M7 microcontrollers. The San Jose, California-based chip supplier claims to have launched the industry’s first auto-qualified M7-based MCUs with Ethernet AVB and Media LB peripherals. In addition, this high-end MCU series for in-vehicle infotainment offers CAN 2.0 and CAN flexible data-rate controllers for higher bandwidth requirements.

Nicolas Schieli, Automotive MCU Marketing Director at Atmel, acknowledges that security is something new in the automotive environment that needs to be tackled as cars become more connected. “Anything can connect to the controller area network (CAN) data links.”

Schieli notes that the Cortex-M7 has embedded enhanced security features within its architecture and scalability. On top of that, Atmel is using its years of expertise in Trusted Platform Modules and crypto memories to securely connect cars to the Internet, not to mention the on-chip SHA and AES crypto engines in SAM E70/V70/V71 microcontrollers for encryption of data streams. “These built-in security features accelerate authentication of both firmware and applications.”
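To make the idea of firmware authentication concrete, here is a minimal host-side Python sketch that tags an image with a keyed SHA-based code and verifies it. The article only says that SHA and AES engines are on-chip; the choice of HMAC-SHA-256, the key handling and the function names below are assumptions for illustration and do not describe Atmel's boot flow.

```python
import hashlib
import hmac


def sign_firmware(key: bytes, image: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over a firmware image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify_firmware(key: bytes, image: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; True only if the image is unmodified."""
    return hmac.compare_digest(sign_firmware(key, image), tag)


# Hypothetical key and image, for demonstration only.
key = b"demo-key-not-for-production"
image = b"\x00\x01\x02\x03 firmware payload"
tag = sign_firmware(key, image)

print(verify_firmware(key, image, tag))            # genuine image
print(verify_firmware(key, image + b"\xff", tag))  # tampered image
```

On a device, the hashing work in such a check is what a hardware SHA engine would offload from the processor.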


He explained how access to the Flash, SRAM, core registers and internal peripherals, whether through the SW-DP/JTAG-DP interface or the Fast Flash Programming Interface, can be blocked to enable security. The automotive-qualified SAM V70 and V71 microcontrollers support the Ethernet AVB and Media LB standards, and they are targeted at in-vehicle infotainment connectivity, audio amplifiers, telematics and head control unit companion devices.

Software Support

The second major advantage that Atmel boasts in the connected car environment is software expertise and an ecosystem to support infotainment applications. For instance, a complete automotive Ethernet Audio Video Bridging (AVB) stack is being ported to the SAM V71 microcontrollers.

Software support is a key leverage in highly fragmented markets like automotive electronics. Atmel’s software package encompasses peripheral drivers, open-source middleware and real-time operating system (RTOS) features. The middleware features include USB class drivers, Ethernet stacks, storage file systems and JPEG encoder and decoder.

Next, the company offers support for several RTOS platforms like RTX, embOS, ThreadX, FreeRTOS and NuttX. Atmel also facilitates the software porting of any proprietary or commercial RTOS and middleware. Moreover, the MCU supplier from San Jose features support for specific automotive software such as AUTOSAR and Ethernet AVB stacks.

Atmel supports IDEs such as IAR, ARM MDK and Atmel Studio, and it provides a single full-featured board that covers all MCU series, including E70, V70 and V71 devices. Moreover, the MCU supplier provides a Board Support Package for the Xplained evaluation kit and easy porting to customer boards through a board definition file (board.h).

Beyond that, Atmel is packing more functionality and software features into its M7 microcontrollers. Take SAM V71 devices, for example, which have three software-selectable low-power modes: sleep, wait and backup. In sleep mode, the processor is stopped while all other functions can be kept running. In wait mode, all clocks and functions are stopped, but some peripherals can be configured to wake up the system based on predefined conditions. In backup mode, the RTT, RTC and wake-up logic are running. Furthermore, the microcontroller can meet the most stringent key-off requirements while retaining 1 Kbyte of SRAM and wake-up on CAN.
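The trade-off between the three modes can be sketched as a small lookup. The Python model below only encodes the description given above; the feature labels are invented for this illustration, and the ordering from deepest to shallowest (backup, wait, sleep) is an assumption rather than a datasheet statement.

```python
from enum import Enum
from typing import Optional


class PowerMode(Enum):
    SLEEP = "sleep"
    WAIT = "wait"
    BACKUP = "backup"


# What keeps running in each mode, paraphrased from the text (labels are hypothetical).
STILL_RUNNING = {
    PowerMode.SLEEP: frozenset({"peripherals", "wakeup_peripherals", "rtt", "rtc", "wakeup_logic"}),
    PowerMode.WAIT: frozenset({"wakeup_peripherals"}),
    PowerMode.BACKUP: frozenset({"rtt", "rtc", "wakeup_logic"}),
}

# Assumed order, from deepest to shallowest.
DEEPEST_FIRST = (PowerMode.BACKUP, PowerMode.WAIT, PowerMode.SLEEP)


def deepest_mode(needed: set) -> Optional[PowerMode]:
    """Return the deepest mode that still keeps every needed feature running.

    Returns None when no low-power mode qualifies, e.g. when the CPU itself is needed.
    """
    for mode in DEEPEST_FIRST:
        if set(needed) <= STILL_RUNNING[mode]:
            return mode
    return None


print(deepest_mode({"rtc"}))
print(deepest_mode({"wakeup_peripherals"}))
print(deepest_mode({"peripherals", "rtc"}))
```

A firmware power manager would apply the same reasoning: list what must stay alive, then pick the deepest mode that still provides it.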

Transition from MPU to MCU

Cortex-M7 is pushing microcontroller performance into the realm of microprocessors. MPUs, which boast a memory management unit and can run operating systems like Linux, eventually lead to higher memory costs. “Automakers and systems integrators are increasingly challenged in getting performance point breakthrough because they are running out of Flash capacity,” explained Schieli.

On the other hand, automotive OEMs are trying to squeeze costs in order to bring the connected car riches to non-luxury vehicles, and here M7 microcontrollers can help bring down costs and simplify car connectivity.

The M7 microcontrollers enable automotive embedded systems without the requirement of a Linux head and can target applications with high performance while running RTOS or bare metal implementation. In other words, M7 opens up avenues for automotive OEMs if they want to make a transition from MPU to MCU for cost benefits.

However, the MPU and MCU worlds are constantly converging and colliding, and the difference between them is not a mere on-off switch. It’s more of a sliding bar. Atmel, having worked on both sides of the fence, can help hardware developers to manage that sliding bar well. “Atmel is using M7 architecture to help bridge the gap between microprocessors and high-end MCUs,” Schieli concludes.


Majeed Ahmad is the author of books Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics and The Next Web of 50 Billion Devices: Mobile Internet’s Past, Present and Future.

Atmel tightens automotive focus with new Cortex-M7 MCUs


Large SoCs without an Ethernet interface typically have slow start-up times and high-power requirements — until now. 


Atmel, a lead partner for the ARM Cortex-M7 processor launch in October 2014, has unveiled three new M7-based microcontrollers with a unique memory architecture and advanced connectivity features for the connected car market.

According to a company spokesman, E70, V71 and V70 chips are the industry’s highest performing Cortex-M microcontrollers with six-stage dual-issue pipeline delivering 1500 CoreMarks at 300MHz. Moreover, V70 and V71 microcontrollers are the only automotive-qualified ARM Cortex-M7 MCUs with Audio Video Bridging (AVB) over Ethernet and Media LB peripheral support.

Cortex-M7-chip-diagramLG

Atmel is among the first suppliers to introduce the ARM Cortex-M7-based MCUs, whose core combines performance and simplicity and further pushes the performance envelope for embedded devices. The new MCU devices are aimed to take the connected car design to the next performance level with high-speed connectivity, high-density on-chip memory, and a solid ecosystem of design engineering tools.

Atmel’s Memory Play

Atmel has memory technology in its DNA, and that seems apparent in the design footprint of E70, V70 and V71 MCUs. The San Jose-based chipmaker is offering a flexible memory system that is optimized for performance, determinism and low latency.

Jacko Wilbrink, Senior Marketing Director at Atmel, said that the company’s Cortex-M7-based MCUs leverage Atmel’s advanced peripherals and flexible SRAM architecture for higher performance applications while keeping the Cortex-M class ease-of-use. He added that the large on-chip SRAM on SAM E70/V70/V71 chips is critical for connected car and IoT product designers since it allows them to run the multiple communication stacks and applications on the same MCU without adding external memory.

On-chip DMA and low-latency access SRAM architecture


Avoiding the external memories reduces the PCB footprint, lowers the BOM cost and eliminates the complexity of high-speed PCB design when pushing the performance to a maximum. Next, Tim Grai, another senior manager at Atmel, pointed out another critical take from Cortex-M7 designs: The tightly coupled memory (TCM) interface. It provides the low-latency memory that the processor can use without the unpredictability that is a feature of cache memories.

Grai says that the most vital memory feature is not the memory itself but how the TCM interface to the M7 is utilized. “The available RAM is configurable to be used as system RAM or tightly-coupled instruction and data memory to the core, where it provides deterministic zero-wait state access,” Grai added. “The arrangement of SRAM allows for multiple concurrent accesses.”

Cortex-M7 a DSP Winner

According to Will Strauss, President & Principal Analyst at Forward Concepts, ARM has had considerable success with its Cortex-M4 power-efficient 32-bit processor chip family. “However, realizing that it lacked the math ability to do more sophisticated DSP functions, ARM has introduced the Cortex-M7, its newest and most powerful member of the Cortex-M family.”

Strauss adds that the M7 provides 32-bit floating point DSP capability as well as faster execution times. With the greater clock speed, floating point and twice the DSP power of the M4, the M7 is even more attractive for applications requiring high-performance audio and even video accompanying traditional automotive and control applications.

Atmel’s Grai added an interesting dimension to the DSP story in Cortex-M7 processor fabric. He pointed out that true DSPs don’t do control and logical functions well and generally lack the breadth of peripherals available on MCUs. “The attraction of the M7 is that it does both—DSP functions and control functions—hence it can be classified as a digital signal controller (DSC).”

Grai quoted the example of Atmel V70 and V71 microcontrollers used to connect end-nodes like infotainment audio amplifiers to the emerging Ethernet AVB network. In an audio amplifier, you receive a specific audio format that has to be converted, filtered, modulated to match the requirement for each specific speaker in the car. So you need Ethernet and DSP capabilities at the same time.

Grai says that the audio amplifier in infotainment applications is a good example of DSC: a mix of MCU capabilities and peripherals plus DSP capability for audio processing. Atmel is targeting the V70 and V71 chips as a bridge between large application processors and Ethernet.

Most of the time, the main processor does not integrate Ethernet AVB, although the infotainment connectivity is based on the Ethernet standard. Here, the V71 microcontroller brings this feature to the main processor. “Large SoCs, which usually don’t have Ethernet interface, have slow start-up time and high power requirements,” Grai said. “Atmel’s V7x MCUs allow fast network start-up and facilitate power moding.”

The SAM E70, V70 and V71

Atmel’s three new MCU devices are aimed at multiple aspects of in-vehicle infotainment connectivity and telematics control.

SAM E70: The microcontroller series features dual CAN-FD, a 10/100 Ethernet MAC with IEEE1588 real-time stamping, and AVB support. It’s aimed at the automotive industry’s movement toward controller area network (CAN) message-based protocols across the whole cabin, eliminating isolation and wire redundancy and bridging everything centrally through the CAN interface.

SAM V70: It’s designed for MediaLB connectivity and leverages advanced audio processing, multi-port memory architecture and Cortex-M7 DSP capabilities. For the media-oriented systems transport (MOST) architecture, old modules are not redesigned. So Atmel offers a MOST solution that is done over Media Local Bus (MediaLB) and is supported by the V70 series.

SAM V71: The MCU series supports a complete automotive Ethernet AVB stack for in-vehicle infotainment connectivity, audio amplifiers, telematics and head control units. It mirrors the SAM V70 series features and combines the Ethernet AVB and MediaLB connectivity stacks.


Majeed Ahmad is the author of books Smartphone: Mobile Revolution at the Crossroads of Communications, Computing and Consumer Electronics and The Next Web of 50 Billion Devices: Mobile Internet’s Past, Present and Future.