Tag Archives: SRAM

How to prevent execution surprises for Cortex-M7 MCU


The ARM Cortex-A series processor cores (A57, A53) are well known in high-performance market segments such as application processing for smartphones, set-top boxes and networking. Look at the wider electronics market, though, and you realize that many applications are cost-sensitive and don’t need such a high-performance processor core. We may call it the embedded market, even if this definition is vague. The ARM Cortex-M family has been developed to address these numerous market segments, starting with the Cortex-M0 for lowest cost, the Cortex-M3 for the best power/performance balance, and the Cortex-M4 for applications requiring digital signal processing (DSP) capabilities.

For audio, voice control, object recognition, and the complex sensor fusion of automotive and higher-end Internet of Things sensing, where complex algorithms are needed to deliver rich audio and visual capabilities, the Cortex-M7 is required. ARM offers the processor core as well as the Tightly Coupled Memory (TCM) architecture, but ARM licensees like Atmel have to implement memories in such a way that the user can take full benefit of the M7 core to meet system performance and latency goals.

Figure 1. The TCM interface provides a single 64-bit instruction port and two 32-bit data ports.


In a 65nm embedded Flash process, the Cortex-M7 can achieve a 1500 CoreMark score while running at 300 MHz, offering top-class DSP performance thanks to a double-precision floating-point unit and a dual-issue instruction pipeline. But algorithms like FIR, FFT or biquad filters need to run as deterministically as possible for real-time response or seamless audio and video performance. How do you best select and implement the memories needed to support such performance? If you choose Flash, it will require caching (as Flash is too slow), which brings the risk of cache misses. SRAM is a better choice, since it can be easily embedded on-chip and permits random access at the speed of the processor.

Peripheral data buffers implemented in general-purpose system SRAM are typically loaded by DMA transfers from system peripherals. The ability to load from a number of possible sources, however, raises the possibility of unnecessary delays and conflicts when multiple DMAs try to access the memory at the same time. In a typical example, we might have three different entities vying for access to the SRAM: the processor (64-bit access, requesting 128 bits for this example) and two separate peripheral DMA requests (DMA0 and DMA1, 32-bit access each). Atmel has gotten around this issue by organizing the SRAM into several banks, as described in this picture:

Figure 2. By organizing the SRAM into banks, multiple DMA bursts can occur simultaneously with minimal latency.
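
To see why banking helps, here is a toy Python model of the three-requester example. It is only a sketch: the word-interleaved address-to-bank mapping, the fixed-priority arbitration, the uniform 32-bit access size for every requester, and the start addresses are all assumptions made for illustration, not Atmel’s documented arbitration scheme.

```python
def cycles_to_serve(num_banks, streams, word_bytes=4):
    """Count the cycles needed to serve every access in every stream.

    Toy rules (assumptions): each bank serves one access per cycle,
    consecutive words map to consecutive banks, and earlier streams
    win when two requesters hit the same bank in the same cycle.
    """
    queues = [list(s) for s in streams]
    cycles = 0
    while any(queues):
        busy_banks = set()
        for queue in queues:
            if not queue:
                continue
            bank = (queue[0] // word_bytes) % num_banks
            if bank not in busy_banks:
                busy_banks.add(bank)
                queue.pop(0)  # access granted this cycle
        cycles += 1
    return cycles


def burst(start, words, word_bytes=4):
    """Byte addresses of a sequential burst of `words` 32-bit words."""
    return [start + i * word_bytes for i in range(words)]


# Processor (128 bits = four words), DMA0 and DMA1; addresses are hypothetical.
streams = [burst(0x000, 4), burst(0x104, 4), burst(0x208, 4)]
single_bank_cycles = cycles_to_serve(1, streams)
four_bank_cycles = cycles_to_serve(4, streams)
print(single_bank_cycles, four_bank_cycles)
```

With these inputs the single-bank memory serializes all twelve accesses, while the four-bank layout lets the three bursts proceed side by side because they never land on the same bank in the same cycle.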


For a chip maker designing microcontrollers, licensing an ARM Cortex-M processor core provides numerous advantages. The very first is the ubiquity of the ARM core architecture, which has been adopted in multiple market segments to support a variety of applications. If this chip maker wants to win a design at a new customer, the probability that this OEM has already used an ARM-based MCU is very high, and it’s very important for the OEM to be able to reuse existing code (we know the heavy weight of software development, in the range of 60% to 70% of the overall project cost). But this ubiquity creates a challenge: how do you differentiate from the competition when competitors can license exactly the same processor core?

Selecting a more aggressive technology node and providing better performance at lower cost is an option, but this advantage can disappear as soon as the competition also moves to that node. Integrating a larger amount of Flash is another option, which is very efficient if the product is designed on a technology that keeps the pricing low enough.

If the chip maker has designed on an aggressive technology node for higher performance and offers a larger amount of Flash than the competition, that may be enough differentiation. Complementing this with a smarter memory architecture, unencumbered by cache misses, interrupts, context swaps, and other execution surprises that work against deterministic timing, brings strong differentiation.


If you want to more completely understand how Atmel has designed this SMART memory architecture for the Cortex-M7, I encourage you to read this white paper from Jacko Wilbrink and Lionel Perdigon entitled “Run Blazingly Fast Algorithms with Cortex-M7 Tightly Coupled Memories.” (You will have to register.) This paper describes MCUs integrating SRAM organized into four banks that can be used as general SRAM and for TCM, showing one example of a Cortex-M7 MCU being implemented in the Atmel | SMART SAM S70, SAM E70 and SAM V70/V71 families.

This post has been republished with permission from SemiWiki.com, where Eric Esteve is a principal blogger, as well as one of the four founding members of the site. This blog was originally shared on August 6, 2015.

Why do drones love the Atmel SAM E70?

Eric Esteve explains why the latest Cortex-M7 MCU series will open up countless capabilities for drones other than just flying. 

By nature, avionics is a mature market requiring the use of validated system solutions: safety is an absolute requirement, and innovative systems require a stringent qualification phase. That’s why the very fast adoption of drones as an alternative to human-piloted planes is impressive. It took 10 or so years for drones to become widely developed and employed for various applications, ranging from war to entertainment, with prices spanning from hundreds of dollars to several hundreds of thousands. But even for consumer-oriented, inexpensive drones, the required processing capabilities call for an MCU that is not only high-performance but versatile as well, capable of managing the built-in gyroscope, accelerometer, geomagnetic sensor, GPS, rotational station, four- to six-axis control, optical flow and so on.


When I was designing for avionics, namely the electronic control of the CFM56 engine (jointly developed by GE in the U.S. and Snecma in France, and equipping Boeing and Airbus planes), the CPU was a multi-hundred dollar Motorola 68020, leading to a cost of $20 per MIPS! While I may not know the Atmel | SMART SAM E70 price precisely (I would guess that it costs a few dollars), what I do know is that the MCU offers in excess of 600 DMIPS. Aside from its high performance, this series boasts a rather large on-chip memory size of up to 384KB SRAM and 2MB Flash, just one of many pivotal reasons that this MCU has been selected to support the “drone with integrated navigation control to avoid obstacle and improve stability.”

In fact, the key design requirements for this application were: 600+ DMIPS, a camera sensor interface, dual ADC and PWM for motor control, and dual CAN, all bundled up in a small package. Looking at the block diagram below helps link the MCU features with the various application capabilities: gyroscope (SPI), accelerometer (SPI x2), geomagnetic sensor (I2C x2), GPS (UART), one- or two-channel rotational station (UART x2), four- or six-axis control communication (CAN x2), voltage/current (ADC), analog sensor (ADC), optical flow sensor (through the Image Sensor Interface, or ISI) and pulse width modulation (PWM x8) to support the rotational station and four- or six-axis speed PWM control.
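
The feature-to-interface mapping above can be written down as a small table and tallied, which is a quick way to check a candidate MCU against the application. The sketch below uses the counts given in the text; where the text only says “ADC” without a multiplier, one channel is assumed, and the dictionary and function names are purely illustrative.

```python
from collections import Counter

# Function -> (interface, channels), taken from the list in the article.
# ADC entries carry no multiplier in the text; one channel each is assumed.
DRONE_FUNCTIONS = {
    "gyroscope": ("SPI", 1),
    "accelerometer": ("SPI", 2),
    "geomagnetic sensor": ("I2C", 2),
    "GPS": ("UART", 1),
    "rotational station": ("UART", 2),
    "axis control communication": ("CAN", 2),
    "voltage/current": ("ADC", 1),
    "analog sensor": ("ADC", 1),
    "optical flow sensor": ("ISI", 1),
    "rotational station and axis speed control": ("PWM", 8),
}


def interface_budget(functions):
    """Sum the channels required per interface type."""
    budget = Counter()
    for interface, channels in functions.values():
        budget[interface] += channels
    return dict(budget)


budget = interface_budget(DRONE_FUNCTIONS)
for interface, channels in sorted(budget.items()):
    print(f"{interface}: {channels}")
```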

For those of you who may not know, the SAM E70 is based on the ARM Cortex-M7, a versatile MCU that combines superior performance with extensive peripheral sets supporting multi-threaded processes. It’s this multi-thread support that will surely open up countless capabilities for drones other than simply flying.

Atmel | SMART ARM Cortex M7 SAM E70

Today’s drones already possess the ability to soar through the air or stay stationary, snapping pictures or capturing HD footage. It is already very impressive to see sub-kilogram devices offering such capabilities! However, the drone market is already looking ahead, preparing for the future, with the desire to get more application stacks into the UAVs so they can take in automation, routing, cloud connectivity (when available), 4G/5G, and other wireless functionalities to enhance data pulling and posting.

For instance, imagine a small town of a few thousand inhabitants where, for a couple of days or weeks per year, a special event or holiday brings a hundred thousand people storming into the area. These folks want to feed their smartphones with multimedia or share live experiences by sending movies or photos, most of them at the same time. The 4G/5G and cloud infrastructure is not tailored for such a number of people, so the communication system may break. Yet this problem could be fixed by simply calling in drone backup to reinforce the communication infrastructure for that period of time.

While this may be just one example of what could be achieved with the advanced usage of drones, each of the innovative applications will be characterized by a common set of requirements: high processing performance, large SRAM and flash memory capability, and extensive peripheral sets supporting multi-threaded processes. In this case, the Cortex M7 ARM-based SAM E70 MCU is an ideal choice with processing power in excess of 640 DMIPS, large on-chip SRAM (up to 384 KB) and Flash (up to 2MB) capabilities managing all sorts of sensors, navigation, automation, servos, motor, routing, adjustments, video/audio and more.


This post has been republished with permission from SemiWiki.com, where Eric Esteve is a principal blogger as well as one of the four founding members of SemiWiki.com. This blog first appeared on SemiWiki on July 18, 2015.

New smart energy solutions @ European Utility Week

Today at the European Utility Week Conference, Atmel debuted its new and comprehensive smart energy platform designed specifically for smart grid communications, electricity, gas and water metering systems and energy measurement applications.

According to Kourosh Boutorabi, Atmel’s Sr. Director of Smart Energy Products, the Atmel SAM4Cx platform includes several system-on-chip (SoC) devices built around a dual-core ARM Cortex-M4 architecture with advanced security, metrology, wireless and power-line communications (PLC) options.


“The unique and highly flexible platform addresses OEM’s system partitioning, bill of materials (BOM) and time-to-market requirements with the widest range of integration and performance optimization options available in the market today,” Boutorabi explained.

“Flexibility to address a new and diverse set of smart grid communications and metrology standards with low power system-on-chip solutions are crucial requirements for OEMs targeting high-volume deployments. We are excited that Atmel’s industry leading technologies address OEM requirements as a new and innovative multi-layered platform.”


Indeed, key features of Atmel’s smart energy platform include best-in-class metrology with class 0.2 accuracy and dynamic range of up to 6000:1 for single and poly-phase applications; low-power PRIME PLC connectivity with integrated line driver; advanced cryptography; the ability to integrate application, communication and metrology; up to 2Mbytes of embedded Flash and 304Kbytes of SRAM with external memory expansion option. Additional specs include low-power RTC, LCD and anti-tamper feature sets designed to reduce smart meter BOM by as much as 40 percent.

Interested in learning more about Atmel’s new and comprehensive smart energy platform? Be sure to check out our official product page here.

Simply AVR: 8-bit ideas with Atmel

Vegard Wollan, co-inventor of AVR microcontroller (MCU) architecture, says AVR “was born from the combination of advanced computer science coupled with proven Flash memory manufacturing techniques.”

Indeed, AVR architecture offers both engineers and Makers robust performance, low power, high speed, connectivity and easy system integration. Based on a single-cycle RISC engine with a rich instruction set, AVR MCUs are capable of delivering close to 1 MIPS (Million Instructions Per Second) per megahertz, as they are optimized for minimum code size and maximum computing performance.

Perhaps most importantly, Atmel makes it possible to create smaller footprint designs, as our AVR MCUs offer a high level of integration with on-chip Flash, SRAM, EEPROM, pull-up resistors, precision oscillator, watchdog timer, brownout detector and GPIO/PWM (pulse-width modulation) pins for application use. Advanced on-chip analog capabilities include an internal temperature sensor, analog comparators, multiple 10-bit and 12-bit ADC (analog-to-digital converter) input channels and a programmable-gain analog amplifier.

On the low power side, Atmel has developed picoPower technology, which enables AVR microcontrollers to reduce power consumption in both sleep and active mode, thereby achieving the industry’s lowest power consumption with 500nA @ 1.8V with RTC running and 100nA with full SRAM retention.
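
To put the 500nA figure in perspective, the back-of-the-envelope calculation below estimates how long a small battery could sustain that sleep current. The 225 mAh capacity is an assumed, typical coin-cell value rather than a number from Atmel, and the model ignores battery self-discharge and any time spent in active mode, so treat the result as an upper bound.

```python
HOURS_PER_YEAR = 24 * 365


def battery_life_years(capacity_mah, current_na):
    """Ideal battery life for a constant current draw (no self-discharge)."""
    current_ma = current_na * 1e-6  # 1 nA = 1e-6 mA
    return capacity_mah / current_ma / HOURS_PER_YEAR


# 500 nA with the RTC running, on an assumed 225 mAh coin cell.
rtc_years = battery_life_years(capacity_mah=225, current_na=500)
print(f"{rtc_years:.1f} years")
```

In practice the battery’s own shelf life would limit the design long before the sleep current does, which is the point of figures this low.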

In terms of software, AVR MCUs are designed with ease of use in mind, from peripherals to datasheets to tools. To be sure, we offer a high-quality, easy-to-use tool chain for the full range of our AVR families. Available for free, Atmel Studio enables code development in C or Assembly by providing cycle-accurate simulation – and integrating seamlessly with AVR starter kits, programmers, debuggers, evaluation kits and reference designs.

This makes AVR microcontrollers ideal for a broad range of applications including industrial control, ZigBee and RF, medical and utility metering, communication gateways, sensor control, white goods and portable battery-powered products. Last, but certainly not least, both Makers and developers can benefit from a robust community following of over 300,000 engineers, with AVR Freaks offering a centralized location where participants frequently interact with each other in various AVR MCU forums.

32-bit AVR MCUs for automotive applications (Part 2)

In the first part of this series, we took a closer look at how Atmel’s AVR low-power 32-bit microcontrollers (MCUs) help enable the implementation of various product-differentiating features, including advanced control algorithms, voice control and capacitive touch sensing.

We also discussed powering Atmel’s AVR UC3C 32-bit automotive-grade microcontrollers with either a 3.3V or a 5V supply (generally supporting 5V I/O). This has been achieved by moving to a modified 0.18-micron process technology, which supports higher I/O voltage levels in a reliable and cost-effective manner without any complex and expensive voltage conversion. In addition to supporting 5V I/O, the UC3C has been designed to support a wide range of high-performance peripherals required by automotive applications, including:

  • ADC: 16 channels with 12-bit resolution at up to 1.5M samples/second; dual sample and hold capabilities; built-in calibration; internal and external reference voltages.
  • DAC:  Four outputs (2 x 2 channels) with 12-bit resolution; up to 1M sample/second conversion rate with 1us settling time; flexible conversion range; one continuous or two sample/hold outputs per channel.
  • Analog comparator:  Four channels with selectable power vs. speed; selectable hysteresis (0, 20mV and 50mV); flexible input selections and interrupts; window compare function by combining two comparators.
  • Timer/Counter: multiple clock sources (five internal and three external); rich feature set (counter, capture, up/down, PWM); two input/output signals per channel; global start control for synchronized operation.
  • Quadrature decoder: Integrated decoder supports direct motor rotation detection.
  • Multiple interfaces: includes a two-channel, two-wire interface (TWI), master/slave SPI, and full-featured USART that can be used as an SPI or LIN.
  • Fully integrated USB:  built-in USB 2.0 transceivers support low (1.5Mbps), full (12Mbps) and on-the-go modes; included in the AVR Software Framework are production-ready drivers for various USB devices (mass storage, HID, CDC, audio), hosts (mass storage, HID, CDC) and combined function devices.
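
As a rough feel for what the converter figures in the list above mean, the sketch below works out the size of one least-significant bit for a 12-bit converter and the time between samples at the quoted rates. It assumes an ideal converter whose full-scale range equals the reference voltage; the 5V and 3.3V references are chosen to match the two supply options and are illustrative only.

```python
def lsb_millivolts(vref_volts, bits=12):
    """Voltage step of one code for an ideal converter, in millivolts."""
    return vref_volts / (2 ** bits) * 1000


def sample_period_ns(samples_per_second):
    """Time between consecutive samples, in nanoseconds."""
    return 1e9 / samples_per_second


lsb_5v = lsb_millivolts(5.0)    # reference assumed equal to a 5V supply
lsb_3v3 = lsb_millivolts(3.3)   # reference assumed equal to a 3.3V supply
adc_period = sample_period_ns(1.5e6)  # ADC at 1.5M samples/second
dac_period = sample_period_ns(1.0e6)  # DAC at 1M samples/second

print(f"{lsb_5v:.3f} mV, {lsb_3v3:.3f} mV, {adc_period:.0f} ns, {dac_period:.0f} ns")
```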

Atmel’s AVR UC3C 32-bit automotive-grade microcontrollers are also designed to achieve higher system throughput with our Peripheral Event System.

“Managing peripherals by the CPU can become a major system bottleneck, especially as the number of peripherals and their operating frequencies increase. With high sampling rates across multiple channels, interrupt overhead and data processing can consume a large percentage of the processor’s available clock cycles,” an Atmel engineering rep told Bits & Pieces. “If the CPU load needs to manage a single SPI port even at a low data rate of 1.2Mbps, this would require 53% of the processor’s capacity. In addition, the interrupt latency increases and introduces jitter.”

And that is why AVR UC3C architecture utilizes Atmel’s peripheral event system, which allows CPU-independent handling of inter-peripheral signaling through an internal communication fabric that interconnects all peripherals. Rather than triggering an interrupt to tell the CPU to read a peripheral or port, the peripheral instead manages itself by directly transferring data to the SRAM for storage – all without requiring any action by the CPU.
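
A simple load model shows how quickly interrupt-driven servicing eats into the CPU budget. The sketch below is a rough estimate under stated assumptions: one interrupt per 8-bit SPI transfer, a 66 MHz CPU clock, and a hypothetical cost of 233 cycles per interrupt (entry, handler and exit), a value chosen only because it reproduces a load close to the 53% quoted above. None of these parameters come from an Atmel datasheet.

```python
def interrupt_cpu_load(bit_rate_bps, bits_per_interrupt, cycles_per_interrupt, cpu_hz):
    """Fraction of CPU cycles spent servicing interrupts."""
    interrupts_per_second = bit_rate_bps / bits_per_interrupt
    return interrupts_per_second * cycles_per_interrupt / cpu_hz


# 1.2Mbps SPI, one interrupt per byte, hypothetical 233 cycles per interrupt.
load = interrupt_cpu_load(
    bit_rate_bps=1_200_000,
    bits_per_interrupt=8,
    cycles_per_interrupt=233,
    cpu_hz=66_000_000,
)
print(f"CPU load: {load:.1%}")
```

When the peripheral moves the data to SRAM by itself, the per-transfer CPU cost in this model drops to zero, which is the effect the event system is after.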

“From a power perspective, only those blocks that are part of the conversion are active. The CPU is free to execute application code or conserve power in idle mode during the entire event,” the Atmel engineering rep continued. “In addition, the peripheral event controller allows a more deterministic response compared to a CPU-based, interrupt-driven event controller, because the latency is fixed to 3 cycles, i.e., 33ns when operating at 66MHz. This enables precise timing of events without jitter, resulting in constant sample rates for ADCs and DACs.”

Interested in learning more about 32-bit AVR MCUs for automotive applications? Be sure to check out part three of this series, which details how Atmel MCUs can be used to help protect IP and bolster system safety. You can also check out parts one, two, three and four of this series.

Capacity and performance characterize Atmel’s megaAVR

Our ongoing coverage of Atmel’s comprehensive AVR portfolio has taken readers on a detailed MCU (microcontroller) tour this month. First, Bits & Pieces dove into the guts of Atmel’s AVR UC3 which is built around high-performance 32-bit AVR architecture and optimized for highly integrated applications.

We then spent some time with Atmel’s AVR XMEGA, an MCU designed for real-time performance, high integration and ultra-low power. And today we want to properly acquaint our readers with Atmel’s megaAVR microcontroller, which is well known for both capacity and performance.

“When your designs need some extra muscle, you need the megaAVR. Ideal for applications requiring large amounts of code, the megaAVR offers substantial program and data memories with performance up to 20 MIPS, with picoPower technology minimizing power consumption,” an Atmel engineering rep told Bits & Pieces. “All megaAVRs offer self-programmability for fast, secure, cost-effective in-circuit upgrades. You can even upgrade the flash while running your application.”

Indeed, the megaAVR family offers Atmel’s widest selection of devices in terms of memories, pin counts and peripherals. This means engineers can choose from general-purpose devices to models with specialized peripherals like USB or LCD controllers, CAN, LIN and Power Stage Controllers.

More specifically, Atmel’s megaAVR family is equipped with on-chip flash, SRAM, internal EEPROM, SPI, TWI, USART, USB, CAN, LIN, watchdog timer, a choice of internal or external precision oscillator and general purpose I/O pins.

In terms of analog functions, the megaAVR boasts advanced analog capabilities, such as ADC, DAC, built-in temperature sensor and internal voltage reference, brown out detector, a fast analog comparator and a programmable analog gain amplifier. Simply put, the high level of integration allows designs with fewer external analog components.

And last, but certainly not least, megaAVR microcontrollers help accelerate the development process with advanced in-system programming and on-chip debug; in-system programming also simplifies production line programming and field upgrades.

Interested in learning more? A full breakdown of our AVR portfolio is available here.

A closer look at Atmel’s AVR CPU

Atmel’s 8- and 32-bit AVR CPUs are based on advanced Harvard architecture – which is perhaps best known for neatly balancing power consumption with performance.

Like every Harvard architecture device, the AVR CPU is equipped with two busses: one instruction bus where the CPU reads executable instructions; and a second data bus to read or write the corresponding data.

“This ensures that a new instruction can be executed in every clock cycle, which eliminates wait states when no instruction is ready to be executed,” an Atmel engineering rep told Bits & Pieces. “The busses in AVR microcontrollers are configured to provide the CPU instruction bus priority access to the on-chip Flash memory. The CPU data bus has priority access to the SRAM.”
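
The benefit of separate instruction and data buses can be illustrated with a deliberately simplified cycle count. In this toy model, which is an assumption-laden sketch and not a description of the real AVR pipeline, every instruction fetch and every data access takes one bus slot; a shared bus must serialize them, while separate buses let a data access overlap with the next instruction fetch.

```python
def cycles_shared_bus(instructions, data_accesses):
    """Single bus: fetches and data accesses compete for the same slots."""
    return instructions + data_accesses


def cycles_separate_buses(instructions, data_accesses):
    """Two buses: data accesses overlap with instruction fetches."""
    return max(instructions, data_accesses)


# Hypothetical workload: 1000 instructions, 300 of which touch data memory.
shared = cycles_shared_bus(1000, 300)
separate = cycles_separate_buses(1000, 300)
print(shared, separate)
```

The model leaves out multi-cycle instructions, branches and wait states, so it only captures the basic reason a Harvard design can keep issuing one instruction per clock.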

To make the AVR instruction set as efficient as possible, Atmel engineers invited compiler experts from IAR Systems to co-develop the first AVR C compiler. Following extensive refinement, the AVR architecture became optimized for C-code execution, with bottlenecks completely eliminated during the construction phase. This is why AVR has become synonymous with small code size, high performance and low power consumption.

“Usually, when the CPU executes a program, it requires frequent access to a limited set of data, including pointers, loop counters, semaphore status bits and array indexes. In fact, close inspection of source code will reveal that most of the data is only required for a very short amount of time, then later discarded,” the engineering rep explained. “That is why the AVR CPU contains multiple ‘working registers,’ which store dynamic data inside the CPU. Organized in a ‘register file,’ they eliminate the need to move temporary data from CPU to SRAM – only to read it back a few cycles later.”

To be sure, the register file is extremely fast, allowing the CPU to read operands, execute an instruction and store the result back into a register in a single clock cycle. The registers also require far less energy when accessed, compared to accessing a large SRAM with long address and data lines. Because no cycles are wasted, power consumption for executing code is greatly reduced.

In terms of DSP instructions, the 32-bit AVR contains a very wide instruction set, with integer, fixed point and floating point DSP instructions, giving it the highest CPU performance of any AVR CPU.

“The 32-bit AVR instruction set also includes saturation and rounding instructions that help speed up loops by requiring no internal range check of intermediate results,” the engineering rep added. “With fast multiply, accumulate, and divide instructions, the 32-bit AVR is the perfect choice for applications that require extensive digital signal processing.”
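
To illustrate what saturation and rounding buy in a DSP loop, the sketch below models both in plain Python for 16-bit signed values. It is a behavioral illustration only: the function names are made up for this example and do not correspond to actual AVR32 instruction mnemonics.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def add_wrap16(a, b):
    """16-bit two's-complement add that wraps around on overflow."""
    return (a + b + 32768) % 65536 - 32768


def add_sat16(a, b):
    """16-bit add that clamps to the representable range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def shift_round(value, shift):
    """Right shift with round-to-nearest instead of truncation (shift >= 1)."""
    return (value + (1 << (shift - 1))) >> shift


wrapped = add_wrap16(30000, 10000)    # overflow flips the sign
saturated = add_sat16(30000, 10000)   # overflow clamps to full scale
rounded = shift_round(5, 1)           # 2.5 rounds up
truncated = 5 >> 1                    # 2.5 truncates down
print(wrapped, saturated, rounded, truncated)
```

With a wrapping add, a loud signal that overflows turns into a large negative sample; with a saturating add it simply clips, so the loop needs no explicit range check on each intermediate result.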

Interested in learning more about Atmel’s 8- and 32-bit AVR portfolio? Check out our official product page here.

An MCU or MPU, that is the question: Part 1

Selecting the most appropriate device (an MCU or MPU) for a new project or design can be somewhat daunting. Indeed, engineers typically analyze a wide range of variables, including price, performance and power consumption.

To make the process easier, we will examine some of the primary differences between an MCU (microcontroller) and MPU (microprocessor).

“Typically, an MCU uses on-chip embedded Flash memory in which to store and execute its program,” Frédéric Gaillard, product marketing manager and Andreas Eieland, senior product marketing manager, told Bits & Pieces.

“Storing the program in this way means that the MCU has a very short start-up period and can be executing code very quickly. The only practical limitation to using embedded memory is that the total available memory space is finite. Indeed, most Flash MCU devices available on the market have a maximum of two Mbytes of program memory and, depending on the application, this could prove to be a limiting factor.”

In contrast, MPUs are not limited by memory constraints in quite the same way, as they employ external memory to provide program and data storage. The program, typically stored in non-volatile memory such as NAND or serial Flash, is loaded into an external DRAM at start-up and subsequently commences execution. On a practical level, this means the MPU will not be up and running as quickly as an MCU, although the amount of DRAM and NVM engineers can connect to the processor is in the range of hundreds of Mbytes and even Gbytes for NAND.

Another notable difference between MPUs and MCUs is power consumption methodology. By embedding its own power supply, an MCU is fine with just one single voltage power rail. However, an MPU typically requires several different voltage rails, prompting the use of additional on-board power ICs/converters. And while MPUs do have low power modes, they have fewer of them, and those modes do not go as low as the ones you would find on a typical MCU.

“With the external hardware supporting an MPU as an added factor, putting an MPU into a low power mode might also be slightly more complex,” the two explained. “In addition, the actual consumption of an MCU is magnitudes lower than an MPU; in low power mode, for example with SRAM and register retention, you can consider a factor of 10 to 100. Obviously this is directly related to the amount of RAM an operating system requires and therefore to be powered to resume operation instantaneously.”

Clearly, design specs are critical when it comes time for an engineer to select an appropriate device for a specific application. For example, is the number of MCU peripheral interface channels sufficient? Do marketing specifications stipulate a user interface (UI) capability that is simply impossible with an MCU due to a lack of on-chip memory and performance?

“When embarking on the first design, engineers know it is highly likely there will be many product variations,” said Gaillard and Eieland. “As such, it is very possible a platform-based design approach will be preferred. This would stipulate more ‘headroom’ in terms of processing power and interface capabilities in order to accommodate future feature upgrades.”

Want to learn more about the differences between MPUs and MCUs? Stay tuned to Bits & Pieces for part 2 of “An MCU or MPU, that is the question.”