Designing Low-Energy Embedded Systems from Silicon to Software

2012/10/04 Silicon Laboratories, Processors/Memory

Introduction
Low-energy system design requires attention to non-traditional factors ranging from the silicon process technology to the software that runs on microcontroller-based embedded platforms. Closer examination at the system level reveals three key parameters that determine the energy efficiency of a microcontroller (MCU): active power consumption, standby power consumption and the duty cycle, which determines the ratio of time spent in either state and is itself determined by the behavior of the software.

A low-energy standby state can make an MCU seem extremely energy efficient, but its true performance is evident only after taking into account all of the factors governing active power consumption. For this and other reasons, the tradeoffs of process technology, IC architecture and software construction are some of the many decisions with subtle and sometimes unexpected outcomes. The manner in which functional blocks on a microcontroller are combined has a dramatic impact on overall energy efficiency. Even seemingly small and subtle changes to hardware implementation can result in large swings in overall energy consumption over the lifetime of a system.

Low-Energy Applications
Metering and alarm systems, for example, are often powered for 10 years by a single battery. A small increase in current consumption for a sensor reading (of which hundreds of millions may occur over the lifetime of the product) can result in years being lost from the product’s actual in-field lifetime. A simple smoke alarm that detects the presence of smoke particles in the air once a second will take 315 million readings during its lifespan.

The activity ratio or duty cycle of a simple smoke alarm is relatively low. Each sensor reading may take no more than a few hundred microseconds to complete, and much of that time is spent in calibration and settling as the microcontroller wakes up the analog-to-digital converters (ADCs) and other sensitive analog elements and allows them to reach a stable point of operation. In this case, the duty cycle is likely to lead to a design that is inactive approximately 99.98 percent of the time.
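The figures above can be checked with a few lines of arithmetic. The sketch below assumes a 10-year life with no leap days and, as an illustrative value only, 200 microseconds of activity per one-second period (the article says only "a few hundred microseconds").

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # leap days ignored for simplicity


def lifetime_readings(years, readings_per_second=1):
    """Total number of sensor readings taken over the product lifetime."""
    return years * SECONDS_PER_YEAR * readings_per_second


def inactive_fraction(active_seconds_per_reading, period_seconds=1.0):
    """Fraction of each period that the MCU spends asleep."""
    return 1.0 - active_seconds_per_reading / period_seconds


readings = lifetime_readings(10)
idle = inactive_fraction(200e-6)  # assumed 200 us of activity per reading

print(f"{readings:,} readings over 10 years")
print(f"{idle * 100:.2f}% of the time inactive")
```

With these assumptions the count comes to 315,360,000 readings and the device is inactive 99.98 percent of the time, matching the rounded values in the text.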

A traditional smoke alarm is comparatively simple. Consider a more complex RF design in which results are relayed over a sensor mesh to a host application. The sensor needs to listen for activity from a master node so that it can either signal that it is still present within the mesh network or provide recently-captured information to the router. However, this increased activity may not affect the overall duty cycle. Instead, more functions may be performed during each activation period using a higher-performance device. Because of its increased processing speed (made possible by a more advanced architecture and semiconductor technology), the faster device can provide greater energy efficiency than a slower device running for more cycles. The key lies in understanding the interaction between process technology, MCU architecture and software implementation.

Part One: Silicon Choices

CMOS Energy Profile
Practically all MCUs are implemented using CMOS technology. The power consumption of any active logic circuitry is given by the formula P = CV²f, where C is the total capacitance of the switching circuit paths within the device, V is the supply voltage, and f is the operating frequency. (See Figure 1.) The voltage and capacitance are factors of the underlying process technology. Over the past three decades, the on-chip operating voltage of CMOS logic has fallen from 12 V to less than 2 V as transistors have scaled down in size. Because the voltage term is squared in the active-power equation, the use of lower voltages has a significant impact.

Figure 1. CMOS Logic Structure and Energy Consumption during Switching
Although the capacitance term is linear, the factors that lead to reductions in its overall level are also assisted greatly by Moore’s Law scaling. A more recent process will, for a given logical function, offer lower capacitance than its predecessors and, with it, lower power consumption. In addition, advanced design techniques make it possible to reduce the overall switching frequency by only operating circuits with actual work to perform, a technique known as clock gating.
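To make the squared voltage term concrete, the following sketch evaluates the CMOS active-power formula for the 12 V and 2 V supply levels mentioned earlier, and then for a halved effective switching frequency such as clock gating might achieve. The capacitance and clock frequency are hypothetical placeholders, not figures from any datasheet.

```python
def dynamic_power(c_farads, v_volts, f_hertz):
    """Active (switching) power of CMOS logic: P = C * V^2 * f."""
    return c_farads * v_volts ** 2 * f_hertz


C_EFF = 100e-12  # hypothetical effective switched capacitance, in farads
F_CLK = 20e6     # hypothetical clock frequency, in hertz

p_12v = dynamic_power(C_EFF, 12.0, F_CLK)
p_2v = dynamic_power(C_EFF, 2.0, F_CLK)
voltage_saving = p_12v / p_2v  # (12 / 2)^2 = 36

# Clock gating lowers the effective switching frequency; power scales linearly.
p_gated = dynamic_power(C_EFF, 2.0, F_CLK / 2)

print(f"12 V -> 2 V cuts active power by a factor of {voltage_saving:.0f}")
print(f"Halving switching activity at 2 V: {p_2v * 1e3:.1f} mW -> {p_gated * 1e3:.1f} mW")
```

The voltage reduction alone yields a factor of 36, whereas the frequency and capacitance terms only scale power linearly.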

Compared to other technologies, CMOS dramatically reduces wasted energy; however, leakage current remains. In contrast to active power consumption, the leakage increases with Moore’s Law scaling and needs to be taken into account in any low-energy application because of the proportion of time that a low duty-cycle system is inactive. However, as with active power consumption, circuit design has a dramatic impact on real-world leakage. Analogous to clock gating, power gating can greatly ameliorate the effects of leakage and make more advanced process nodes better choices for low duty-cycle systems, even though an older process technology may offer a lower theoretical leakage figure.

Appropriate Process Technology
There is an appropriate process technology for every feature set. The answer is not to simply rely on one process technology that has the lowest theoretical leakage just because the device will spend a long time in sleep mode. During sleep mode, it is possible to disable power to large segments of the MCU, taking the leakage component out of the equation. Leakage is a bigger issue when circuits are active, but can easily be outweighed by the advantages of more advanced transistors that switch far more efficiently.

As an example, the leakage current of a 90 nm process is approximately five times higher than that of a dedicated low-power 180 nm process. The active-mode power consumption is a factor of four lower, but this is based on a far larger figure. Take a 180 nm microcontroller with an active current consumption of 40 mA and a deep-sleep mode consumption of 60 nA and compare these power levels to those of a 90 nm implementation that is able to drive the active current draw down to 10 mA but suffers from a higher sleep mode current of 300 nA.

In the previous example, the MCU must be active for more than about 0.0008 percent of the time for the 90 nm implementation to be more energy efficient overall. In other words, if the system is active one second per day, the 90 nm version is approximately 1.5 times as energy efficient as its 180 nm counterpart. The conclusion is that it is important to understand the application duty cycle when selecting a process geometry. (See Figure 2.)
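The break-even point can be reproduced with a simple two-state average-current model using the four current figures quoted above. This is a minimal sketch: it assumes both devices run from the same supply voltage and ignores wake-up overhead, neither of which the article specifies.

```python
def average_current(active_a, sleep_a, duty):
    """Time-weighted average current for a given active duty cycle (0..1)."""
    return active_a * duty + sleep_a * (1.0 - duty)


def break_even_duty(active_old, sleep_old, active_new, sleep_new):
    """Duty cycle at which both devices draw the same average current.

    Solves active_old*d + sleep_old*(1-d) = active_new*d + sleep_new*(1-d).
    """
    extra_sleep = sleep_new - sleep_old      # penalty paid while asleep
    saved_active = active_old - active_new   # benefit gained while active
    return extra_sleep / (saved_active + extra_sleep)


# Figures from the text: 180 nm (40 mA / 60 nA) versus 90 nm (10 mA / 300 nA).
d_be = break_even_duty(40e-3, 60e-9, 10e-3, 300e-9)

one_second_per_day = 1.0 / 86400.0
i_180 = average_current(40e-3, 60e-9, one_second_per_day)
i_90 = average_current(10e-3, 300e-9, one_second_per_day)
ratio = i_180 / i_90

print(f"break-even duty cycle: {d_be * 100:.4f} %")
print(f"180 nm / 90 nm average current at 1 s per day: {ratio:.2f}")
```

The break-even duty cycle comes out at about 0.0008 percent, as stated. Under this simplified model, one second of activity per day gives a ratio of roughly 1.26, so the larger advantage quoted in the text presumably reflects factors that are not modeled here, such as a lower supply voltage on the 90 nm device.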

Once the appropriate process technology has been selected, the IC designer has the option to further optimize energy performance. When first introduced, the concept of clock gating was applied at a relatively coarse level. Clock gating increases the complexity of a design because the circuit designer needs to be aware of which logic paths will require a clock signal at any given time.

Clock Distribution
Most microcontroller implementations use a hierarchical structure to distribute clock signals and the appropriate voltage levels to each part of the IC. The functional units, such as instruction processing blocks and peripherals, are organized into groups. Each of these groups will be fed by a separate clock tree and power network. The clock signal for each group is derived from a common clock source by a frequency divider or multiplier. Similarly, the voltage delivered to each group of peripherals will be controlled by a set of power transistors and voltage regulators if the groups require different voltages (an approach that is becoming increasingly common).

To minimize design complexity, MCUs have used a relatively simple clock-gating scheme in which entire clock trees are disabled as long as no functional units inside a group are active; however, this allows logic that is performing no useful work to be clocked in groups that are active. For example, the adder unit in a CPU core might receive a clock even if the current instruction is a branch. The switching triggered by the clock signal within that adder adds to power consumption in proportion to CV²f, as described earlier.

Improvements in design tools and techniques have made it possible to increase the granularity of clock gating to the point where no peripheral or functional unit receives a clock signal if it has no work to perform during that cycle.

Voltage scaling provides further potential energy savings by making it possible to deliver a lower voltage to a particular group of functional units when required. The key to delivering the appropriate voltage to a group of functional units or peripherals lies in the implementation of on-chip voltage regulators or dc-dc converters and the use of monitoring circuits to ensure that the IC operates at the required voltage.

Power-Supply Considerations
On-chip voltage regulators provide the system designer with greater flexibility, making it possible to extract more charge from a battery. For example, an on-chip switching buck converter (like the ones found in Silicon Labs’ SiM3L1xx series products) can be used to take the 3.6 V of an industrial battery and convert it to 1.2 V at more than 80 percent efficiency. Many MCUs do not have this feature and use linear components to drop the voltage to the right level with a greater degree of waste. In advanced implementations, the buck regulator can be switched off when the battery has discharged to such a level that it no longer makes sense to perform the conversion. As a result, the power supply can be optimized for energy efficiency over the lifetime of the device, all under software control.
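The difference between the two regulator types can be estimated from first principles. The sketch below compares the battery current drawn by an ideal linear regulator, which passes the load current straight through, with that of a buck converter at the 80 percent efficiency quoted above. The 10 mA load is a hypothetical value, and regulator quiescent current is ignored.

```python
def linear_input_current(i_load):
    """A linear regulator draws the full load current from the battery."""
    return i_load


def linear_efficiency(v_in, v_out):
    """Best-case efficiency of a linear regulator is simply v_out / v_in."""
    return v_out / v_in


def buck_input_current(v_in, v_out, i_load, efficiency):
    """Battery current of a switching converter: P_in = P_out / efficiency."""
    return (v_out * i_load) / (efficiency * v_in)


V_BATT, V_CORE = 3.6, 1.2  # voltages from the text
I_LOAD = 10e-3             # hypothetical core current, in amperes

i_linear = linear_input_current(I_LOAD)
i_buck = buck_input_current(V_BATT, V_CORE, I_LOAD, efficiency=0.80)

print(f"linear: {i_linear * 1e3:.2f} mA ({linear_efficiency(V_BATT, V_CORE):.0%} efficient)")
print(f"buck:   {i_buck * 1e3:.2f} mA (80% efficient)")
```

Under these assumptions the buck converter draws a little over 40 percent of the battery current that the linear regulator needs for the same load, which is where the extra battery life comes from.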

Part Two: Software Decisions

Performance Scaling
Implementing energy-efficient embedded applications relies on software design that uses hardware resources in the most appropriate way. What is appropriate depends not only on the application but also on the hardware implementation. Likewise, the more flexible the hardware in terms of CPU, clock, voltage and memory usage, the greater the potential energy savings the developer can achieve. Hardware-aware software tools provide the embedded systems engineer with greater awareness of what further savings are achievable.

One option is to employ dynamic voltage scaling, as shown in Figures 3 and 4. This technique is made possible by on-chip dc-dc converters and performance-monitoring circuits, which provide the ability to reduce the supply voltage when the application does not need to execute instructions at the highest speed. Under these conditions, the system operates with reduced power consumption. The benefits that can be achieved are a function of input voltage and can vary over the life of a product. The following figures show the relative differences between no voltage scaling (VDD fixed), static voltage scaling (SVS) and active voltage scaling (AVS).

An interesting artifact of AVS is that the AVS strategy can change depending on the input voltage to the system. In this example, when the input voltage is 3.6 V, it is more efficient to power the internal logic as well as the flash memory from a high-efficiency internal dc-dc converter. However, as the input voltage falls (i.e., battery discharge over product life cycle), it is more efficient to power the flash memory subsystem from the input voltage directly because the internal logic can operate at lower voltages than the memory. For example, the new SiM3L1xx low-power 32-bit microcontroller family from Silicon Labs has a flexible power architecture with six separate and variable power domains that enable this kind of dynamic optimization.

Typically, CMOS logic circuits will operate more slowly as their voltage is reduced. If the application can tolerate lower performance (often the case when dealing with communications protocols that demand data be delivered no faster than a certain standardized frequency), then the quadratic reduction in energy consumption with lower voltage can provide large energy savings. Leakage provides a lower limit on voltage scaling. If each operation takes too long, leakage will begin to dominate the energy equation and increase overall energy consumption. For this reason, it can make sense to execute a function as quickly as possible and then put the processor into sleep mode to minimize the leakage component.
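The trade-off between switching energy and leakage can be illustrated with a toy model. The sketch below is built entirely on assumptions: the clock frequency is taken to rise linearly with the voltage above a threshold, the leakage current is treated as constant, and every numeric parameter is hypothetical. It is meant only to show the shape of the curve, not to describe any real device.

```python
def task_energy(v, cycles, c_eff, i_leak, v_t, k):
    """Energy in joules to run `cycles` clock cycles at supply voltage v.

    Assumed model: f = k * (v - v_t); dynamic energy = C * V^2 per cycle;
    leakage energy = i_leak * v * run time.
    """
    f = k * (v - v_t)
    run_time = cycles / f
    e_dynamic = c_eff * v ** 2 * cycles
    e_leakage = i_leak * v * run_time
    return e_dynamic + e_leakage


# Hypothetical parameters chosen only to make the trend visible.
CYCLES = 1_000_000
C_EFF = 100e-12   # farads switched per cycle
I_LEAK = 100e-6   # amperes of leakage while the core is powered
V_T = 0.5         # volts, below which the logic stops working
K = 50e6          # hertz of clock speed gained per volt above V_T

voltages = [mv / 1000 for mv in range(550, 1801, 50)]
energies = {v: task_energy(v, CYCLES, C_EFF, I_LEAK, V_T, K) for v in voltages}
best = min(energies, key=energies.get)

for v in (voltages[0], best, voltages[-1]):
    print(f"{v:.2f} V -> {energies[v] * 1e6:.1f} uJ")
```

In this model, lowering the voltage saves energy only down to a point. Below that, the task runs so slowly that leakage accumulated over the longer run time outweighs the savings in switching energy, which is the lower limit described above.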

Take, for example, a wireless sensor application that needs to perform a significant amount of digital signal processing (DSP), such as a glass-breakage detector. In this example, the application uses a Fast Fourier Transform (FFT) to analyze the vibrations picked up by an audio sensor for the characteristic frequencies generated by glass shattering. The FFT is relatively complex, so executing it at a lower frequency governed by a reduced voltage is likely to increase leakage substantially, even in older process technologies. The best approach, in this case, is to execute at near maximum frequency and then return to sleep until the time comes to report any findings to a host node.

The wireless protocol code, however, imposes different requirements. Radio protocols have fixed timings for events, and in some cases the protocol can be handled entirely in hardware. Because finishing early brings no benefit, it makes more sense to reduce the processor core’s voltage so that the code needed for packet assembly and transmission runs at a speed appropriate to the wireless protocol.

The addition of hardware blocks such as intelligent direct memory access (DMA) can further change the energy trade-offs. Many DMA controllers, such as the one provided by the native ARM® Cortex™-M3 processor, require frequent intervention from the processor. However, more intelligent DMA controllers that support a combination of sequencing and chaining allow the processor to compute packet headers, encrypt data, assemble packets, and then hand over the work of passing the packets at appropriate intervals to the memory buffers used by the radio front-end. For much of the time that the radio link is active, the processor can sleep, saving a significant amount of energy.

Memory Usage
With modern 32-bit microcontroller devices, software engineers have a high degree of freedom in the way memory blocks are used. Typically, the MCU will provide a mixture of non-volatile flash memory for long-term code and data storage along with static random access memory (SRAM) to hold temporary data. In most cases, the power consumption of flash memory accesses will be higher than that of accesses made to SRAM. In the normal usage case, a flash memory read consumes roughly three times the power of an SRAM read. Flash memory writes, which require entire blocks to be erased and then rewritten using a lengthy sequence of relatively high-voltage pulses, consume even more power; however, for most applications, flash write operations are infrequent and do not materially affect the average power consumption.

A further factor in flash-memory power consumption is how accesses from the processor are distributed. Within each block of flash memory there are multiple pages, each of which can be up to 4 kB in size. To support any access, the page involved has to be powered up; any unused pages can be maintained in a low-power state.

If a regularly accessed section of code straddles two flash pages rather than being contained within one, the energy associated with instruction reads will increase. Reallocating memory to place frequently accessed sections of code and data within discrete pages can result in sizeable savings in power consumption over the lifetime of a battery charge with no changes to the physical hardware.
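Whether a function straddles a page boundary can be checked from its start address and size, for example using values taken from a linker map file. The sketch below assumes a uniform 4 kB page size (the text says pages can be "up to" 4 kB), and the addresses are made up for illustration.

```python
PAGE_SIZE = 4 * 1024  # assumed uniform flash page size, in bytes


def pages_touched(start_addr, size, page_size=PAGE_SIZE):
    """Number of flash pages covered by `size` bytes starting at `start_addr`."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_page = start_addr // page_size
    last_page = (start_addr + size - 1) // page_size
    return last_page - first_page + 1


def straddles_page(start_addr, size, page_size=PAGE_SIZE):
    """True if the region crosses at least one page boundary."""
    return pages_touched(start_addr, size, page_size) > 1


# Hypothetical placements of the same 64-byte sensor-reading loop.
crossing = straddles_page(0x0FF0, 0x40)  # starts 16 bytes before a boundary
aligned = straddles_page(0x1000, 0x40)   # starts exactly on a boundary

print("placed at 0x0FF0 straddles a page:", crossing)
print("placed at 0x1000 straddles a page:", aligned)
```

A check of this kind could be run over the hot functions after each build to flag placements worth changing in the linker configuration.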

It often makes sense to copy functions that are used more frequently into on-chip SRAM and read their instructions from there rather than from flash, even though this appears to use the memory capacity less efficiently. The benefit in battery life can easily make up for the slightly higher memory consumption.

Code Optimization
Energy optimization can also upend traditional ideas of code efficiency. For decades, embedded systems engineers have focused on optimizing code for memory size except when performance is critical. Energy optimization provides an altogether new set of metrics. An important consideration is usage of the on-chip cache that is generally available to 32-bit microcontroller platforms.

Optimization for code size enables retention of more of the executable in cache, which improves both speed and energy consumption. However, function calls and branches that are used to reduce the size of the application through the reuse of common code can result in unintended conflicts between sections of the code for the same line in the cache. This can result in wasteful ‘cache-thrashing’ as well as multiple flash page activations when the instructions need to be fetched from main memory.

For code that runs frequently during the lifetime of the product, it makes sense for it to be compact enough to fit into the cache and to avoid branches and function calls. Consider a smoke alarm; even if the alarm triggers once a week (perhaps from excess smoke caused by activity in the kitchen), that is only 520 events out of 315 million during the alarm’s 10-year life. The vast majority of the time, the code only takes a sensor reading, finds that the threshold has not been exceeded and then puts the processor core back to sleep until it is woken by the system timer.

Out of all the sensor readings that the alarm takes, less than 0.0002 percent will result in the execution of alarm-generating code. The remaining 99.9998 percent of code execution will be of the core sensor-reading loop. Ensuring that this code is run in a straight line out of cache can be the key to minimizing energy usage. Because it runs so infrequently, the remaining code can be optimized using more traditional techniques.
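The weighting behind this argument can be made explicit. The sketch below computes the share of wake-ups that reach the alarm path and then, using purely hypothetical energy costs for the two paths, how little the rarely executed code contributes to lifetime energy even if it is far more expensive per run.

```python
READINGS = 10 * 365 * 24 * 3600  # one reading per second for 10 years
ALARMS = 10 * 52                 # one alarm per week for 10 years

alarm_fraction = ALARMS / READINGS

# Hypothetical per-execution energy costs, in joules.
E_READ_LOOP = 1e-6    # core sensor-reading loop
E_ALARM_PATH = 1e-3   # alarm-generating code, assumed 1000x more costly

total_energy = (READINGS - ALARMS) * E_READ_LOOP + ALARMS * E_ALARM_PATH
alarm_energy_share = ALARMS * E_ALARM_PATH / total_energy

print(f"alarm path taken on {alarm_fraction * 100:.6f} % of readings")
print(f"alarm path share of lifetime energy: {alarm_energy_share * 100:.2f} %")
```

Even with the alarm path assumed to cost a thousand times more per execution, it accounts for well under one percent of the total, which is why optimization effort is best spent on the sensor-reading loop.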

Tools for Energy Efficiency
Tool support is vital for maximizing the energy efficiency of an MCU platform. The ability to allocate functions to discrete pages of flash memory requires a linker that understands the detailed memory map of each target microcontroller. The linker can take developer input on whether blocks can be allowed to cross page boundaries and generate a binary that is optimized for the most energy-efficient use of non-volatile storage. In principle, the linker can also be used to ensure that functions and data are placed in such a way that the most commonly executed ones do not clash over cache lines. This level of detail can be achieved much more easily when the tools are provided by the MCU vendor (who knows the memory layout and power requirements of each target platform). This is far more difficult for a third-party vendor to achieve.

The MCU vendor also has a detailed understanding of how the different peripherals and on-chip buses are organized. This knowledge can be applied in tools to guide the engineer in making choices that do not waste power. The graphical AppBuilder environment developed by Silicon Labs is one such example, as shown in Figure 5. This tool makes it possible to define the framework for an application by dragging and dropping peripherals onto a canvas.

AppBuilder can look at the peripheral setup and determine whether energy-saving changes are possible. For example, if a user has pulled a UART into the application and set its speed to 9600 baud, the tool will view the peripheral bus of the UART and determine the appropriate setting. The ARM Peripheral Bus (APB) used to host blocks, such as UARTs and analog-to-digital converters, can run at up to 50 MHz. In this instance, this speed is far higher (and will consume more energy) than is necessary; so, the tool asks if the user wants to reduce the APB’s data rate to a level that is more appropriate.
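The kind of check described here can be sketched as a search for the slowest bus clock that still supports the requested baud rate. This is not AppBuilder's actual algorithm, which the article does not describe. The power-of-two dividers, the 16x oversampling requirement and the function name are all assumptions made for illustration.

```python
def lowest_apb_clock(max_apb_hz, baud, oversampling=16, max_shift=15):
    """Slowest power-of-two division of the APB clock that still meets the
    UART's minimum input clock of baud * oversampling (assumed model)."""
    required = baud * oversampling
    if max_apb_hz < required:
        raise ValueError("APB clock is too slow for this baud rate")
    best = max_apb_hz
    for shift in range(max_shift + 1):
        candidate = max_apb_hz / (1 << shift)
        if candidate < required:
            break  # any further division would be too slow
        best = candidate
    return best


apb_hz = lowest_apb_clock(50e6, 9600)
print(f"suggested APB clock: {apb_hz / 1e3:.1f} kHz (down from 50 MHz)")
```

For a 9600 baud UART under these assumptions, the bus could be divided by 256 and still leave the UART enough clock to generate its bit timing.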

In addition, AppBuilder software provides the engineer with other application-specific information on power consumption. Using a simulation of the target MCU (again made possible by a detailed understanding of the silicon features), the tool can provide an interactive histogram of estimated current not just for the entire application but for the processor and each peripheral.

Development tools will evolve to become more “power-aware.” Traditionally, debug features such as breakpoints have been set on events (e.g., memory reads and writes). In the future, it is conceivable that breakpoint support will evolve to handle power-related issues. For example, if power consumption at a particular point or the integrated energy since the last sleep state exceeds a target, the debugger will trigger and show which parts of the application are consuming higher-than-expected amounts of power (e.g., code that straddles a flash-page boundary may be running more frequently than expected). Higher-than-expected consumption and information on the code’s position in the memory map provide vital clues to help the software engineer take appropriate action.
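As a thought experiment, an energy breakpoint of this kind could work by integrating power over a trace of execution segments and stopping at the first segment that pushes the total past a budget. The sketch below is purely hypothetical: the trace format, the function name and all the numbers are invented, and no existing debugger is being described.

```python
def energy_breakpoint(segments, budget_joules):
    """Return (index, label, energy_so_far) for the first segment at which the
    energy integrated since the last sleep exceeds the budget, else None.

    Each segment is (label, seconds, amps, volts).
    """
    total = 0.0
    for index, (label, seconds, amps, volts) in enumerate(segments):
        total += seconds * amps * volts
        if total > budget_joules:
            return index, label, total
    return None


# Hypothetical trace of one wake-up period.
trace = [
    ("adc_read", 100e-6, 5e-3, 1.8),
    ("fft", 2e-3, 10e-3, 1.8),
    ("radio_tx", 1e-3, 20e-3, 1.8),
]

hit = energy_breakpoint(trace, budget_joules=30e-6)
if hit is not None:
    index, label, energy = hit
    print(f"budget exceeded in segment {index} ({label}) at {energy * 1e6:.1f} uJ")
```

In this made-up trace the budget is exceeded during the FFT segment, which is where such a debugger would halt and point the engineer.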

Conclusion
Low-energy system design is a holistic process that is enabled by choosing a combination of the right silicon, software and development tools. By mastering the relationship between each of these variables, systems engineers can develop higher performance and more energy-efficient embedded systems that stretch the limits of battery-powered applications.

Silicon Labs invests in research and development to help our customers differentiate in the market with innovative low-power, small size, analog intensive mixed-signal solutions. Silicon Labs' extensive patent portfolio is a testament to our unique approach and world-class engineering team. Patent: www.silabs.com/patent-notice

© 2012, Silicon Laboratories Inc. ClockBuilder, DSPLL, Ember, EZMac, EZRadio, EZRadioPRO, EZLink, ISOmodem, Precision32, ProSLIC, QuickSense, Silicon Laboratories and the Silicon Labs logo are trademarks or registered trademarks of Silicon Laboratories Inc. ARM and Cortex-M3 are trademarks or registered trademarks of ARM Holdings. ZigBee is a registered trademark of ZigBee Alliance, Inc. All other product or service names are the property of their respective owners.
