Commit ccdde5f

Re-implement PWM generator logic (#7231)
* Re-implement PWM generator logic

  Add special-purpose PWM logic to preserve alignment of PWM signals for things like RGB LEDs.

  Keep a sorted list of GPIO changes in memory. At time 0 of the PWM cycle, set all pins to high. As time progresses, bring down the additional pins as their duty cycle runs out. This way all PWM signals are time aligned by construction.

  This also reduces the number of PWM interrupts by up to 50%. Before, both the rising and falling edge of a PWM pin required an interrupt (and could shift around accordingly). Now, a single IRQ sets all PWM rising edges (so 1 no matter how many PWM pins) and individual interrupts generate the falling edges.

  The code favors duty cycle accuracy over PWM period accuracy (since PWM is simulating an analog voltage, it's the %age of time high that's the critical factor in most apps, not the refresh rate). Measurements give it about 35% less total error over full range at 20kHz than master.

  @me-no-dev used something very similar in the original PWM generator.

* Adjust running PWM when analogWriteFreq changed

  Use fixed point math to adjust running PWM channels to the new frequency.

* Also preserve phase of running tone/waveforms

  Copy over full high/low periods only on the falling edge of a cycle, ensuring phase alignment for Tone and Servo.

* Clean up signed/unsigned mismatch, 160MHz operation

* Turn off PWM on a Tone or digitalWrite

  Ensure both the general purpose waveform generator and the PWM generator are disabled on a pin used for Tone/digitalWrite.

* Remove hump due to fixed IRQ delta

  A hump in the dueling PWMs was very prominent in prior pulls. The hump was caused by having a PWM falling edge just before the cycle restart, while having the other channel requesting a 1->0 transition just outside the busy-loop window of 10us. So it gets an IRQ for channel B 0->1, then waits 2..8us for the next PWM full cycle 0->1, and ends up returning from interrupt and not scheduling another IRQ for 10us, hence the horizontal leg of the bump.

  Reduce the minimum IRQ latency a little bit to minimize this effect. There will still be a (significantly smaller) hump when things cross, but it won't be anywhere near as bad or detectable.

* Speed PWM generator by reordering data struct

  Breaking out bitfields required a load and an AND, slowing things down in the PWM loop. Convert the bitfield into two separate natural-sized arrays to reduce code size and increase accuracy.

* Remove if() that could never evaluate TRUE

* Add error feedback to waveform generation

  Apply an error term to generated waveform phase times to adjust for any other ongoing processes/waveforms.

  Take the actual edge generation times, subtract them from the desired, and add 1/4 of that (to dampen any potential oscillations) to the next similar phase of that waveform.

  Allows the waveform to seek its proper period and duty cycle without hardcoding any specific calibrations (which would change depending on the codepaths, compiler options, etc.) in the source.

* Move _stopPWM and _removePWMEntry to IRAM

  Thanks to @dok-net for noticing these need to be in IRAM as they may be called by digitalWrites in an IRQ.

* Avoid long wait times when PWM freq is low

* Fix bug where tone/pwm could happen on same pin

* Adjust for random 160MHz operation

  The WiFi stack sometimes changes frequency behind our backs, so ESP's cycle counter does not count constant ticks.

  We can't know how long it's been at a different than expected frequency, so do the next best thing and make sure we adjust any ESP cycles we're waiting for by the current CPU speed.

  This can lead to a blip in the waveform for 1 period when the frequency toggles from normal, and when it toggles back, but it should remain for the intervening periods.

  Should avoid a lot of LED shimmering and servo errors during WiFi connection (and maybe transmission).

* Clean up leftover debugs in ISR

* Subtract constant-time overhead for PWM, add 60kHz

  PWM has a constant minimum time between loops with a single pin, so pull that time out of the desired PWM period and shift the center of the PWM frequency closer to the desired without any dynamic feedback needed.

  Enable 60kHz PWM, even though it's not terribly useful as it causes an IRQ every ~8us (and each IRQ is 2-3us). The core can still run w/o WDT, but its performance is about 5x slower than unloaded.

* Fix GPIO16 not toggling properly.

* Remove constant offset to PWM period

  analogWrite doesn't know about the change in total PWM cycles, so it is possible for it to send in a value that's beyond the maximum adjusted PWM cycle count, royally messing up things. Remove the offset.

  Also, fix bug with timer callback functions potentially disabling the timer if PWM was still active.

* Remove volatiles, replace with explicit membarrier

  Volatiles are expensive in flash/IRAM as well as in runtime because they introduce `memw` instructions everywhere their values are used.

  Remove the volatiles and manually mark handshake signals for re-read/flush to reduce code and runtime in the waveform generator/PWM.

* Consolidate data into single structure

  Save IRAM and flash by using a class to hold waveform generator state. Allows for base+offset addressing to be used in many cases, removing `l32r` and literals from the assembly code.

* Factor out common timer shutdown code

* Remove unneeded extra copy on PWM start

* Factor out common edge work in waveform loop

* Factor out waveform phase feedback loop math

* Reduce PWM size by using 32b count, indexes

  Byte-wide operations require extra instructions, so make index and count a full 32-bits wide.

* GP16O is a 1-bit register, just write to it

  Testing indicates that GP16O is just a simple 1-bit wide register in the RTC module. Instead of |= and &= (i.e. RmW), use direct assignment in PWM generator.

* Increase PWM linearity in low/high regions

  By adjusting the PWM cycle slightly to account for the fixed time through the compute loop, increase the linear response near the min and max areas.

* Remove redundant GetCycleCount (non-IRQ)

* Factor out common timer setup operations

* Fix clean-waveform transition, lock to tone faster

  New startWaveform waveforms were being copied over on the falling edge of the cycle, not the rising edge. Everything else is based on rising edge, so adjust accordingly.

  Also, feed back a larger % of the error term in standard waveform generation. Balances the speed at which it locks to tones under changing circumstances with it not going completely bonkers when a transient error occurs due to some other bit.

* Reduce IRAM by pushing more work to _setPWM

  Simply mark pins as inactive, don't adjust the ordered list until the next _startPWM call (in IROM).

* Fix typo in PWM pin 1->0 transition

  Actually check the pin mask is active before setting the PWM pin low. D'oh.

* Combine cleanup and pin remove, save 50 bytes IROM

  The cleanup (where marked-off pins are removed from the PWM time map) and remove (where a chosen pin is taken out of the PWM map) do essentially the same processing. Combine them and save ~50 bytes of code and speed things up a tiny bit.

* Remove unused analogMap, toneMap

  Save ~100 bytes of IROM by removing the tone/analog pin tracking from the interface functions. They were completely unused.

* Save IRAM/heap by adjusting WVF update struct

  The waveform update structure included 2 32-bit quantities (so, used 8 * 17 = 136 bytes of RAM) for the next cycle of a waveform.

  Replace that with a single update register, in a posted fashion. The logic now sets the new state of a single waveform and returns immediately (so, no need to wait 1ms if you've got an existing waveform of 1kHz). The waveform NMI will pick up the changed value on its next cycle.

  Reduces IRAM by 40 bytes, and heap by 144 bytes.

* Don't duplicate PWM period calculation

  Let the waveform generator be the single source of truth for the PWM period in clock cycles. Reduces IRAM by 32 bytes and makes things generally saner.

* Factor out common PWM update code

  Replace repeated PWM update logic with a subroutine, and move the PWMUpdate pointer into the state itself. Reduces IROM and IRAM, removes code duplication.

  Also remove single-use macros and ifdef configurable options as the IRAM and IROM impact of them are now not very large.

* Fix regression when analogWrite done cold

  Lost an `initTimer()` call in a refactoring, resulting in the core hanging forever while waiting for the NMI which will never happen. Re-add as appropriate.

* Save 16b of IRAM by not re-setting edge intr bit

  Per @dok-net, drop the rewrite of the edge trigger flag in the timer interrupt register. It's set on startup and never cleared, so this is redundant. Drops ~16 bytes of IRAM.

* Allow on-the-fly PWM frequency changes

  When PWM is running and analogWriteFreq is called, re-calculate the entire set of PWM pins to the new frequency. Preserve the raw numerator/denominator in an unused bit of the waveform structure to avoid wasting memory.

* Adjust for fixed overhead on PWM period

  Pulls the actual PWM period closer to the requested one with a simple, 0-overhead static adjustment.

* Fix value reversal when analogWrite out of range

  Silly mistake, swapped high and low values when checking analogWrite for over/under values. Fixed.

* Don't optimize the stopWaveform call

  Save a few bytes of IRAM by not using -O2 on the stopWaveform call. It is not a speed-critical function.

* Avoid side effects in addPWMtoList

* Adjust PWM period as fcn of # of PWM pins

  Results in much closer PWM frequency range over any number of PWM pins, while taking 0 add'l overhead in IRAM or in the IRQ.

* Fix occasional Tone artifacts

  When _setPWMFreq was called, the initial PWM mask was not set to 0, leading to occasional issues where non-PWM pins would be set to 1 on the next PWM cycle. Manifested itself as an overtone at the PWM frequency +/-.

* Reduce CPU usage and enhance low range PWM output

  Borrow a trick from #7022 to exit the busy loop when the next event is too far out. Also reduce the IRQ delta subtraction because it was initially not NMI so there was much more variation than now.

  Keep the PWM state machine active at a higher prio than the standard tone generation when the next edge is very close (i.e. when we're at the max or min of the range and have 2 or more near edges). Adds a lot of resolution to the response at low and high ranges.

  Go from relative to absolute cycle counts in the main IRQ loop so that we don't mingle delta-cycles when the delta start was significantly different.

* Update min IRQ time to remove humps in PWM linearity

  Keep PWM error <2.0% on entire range, from 0-100%, and remove the hump seen in testC by fixing the min IRQ delay setting.

* Remove minor bump at high PWM frequencies

  The IRQ lead time was a tiny bit undersized, causing IRQs to come back too late for about .25us worth of PWM range. Adjust the constant accordingly.
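
The first bullet of the commit message describes raising every PWM pin at time 0 and then walking a sorted list of falling edges. The following is only a host-side sketch of that idea, not the code from this commit: `PwmPin`, `PwmEvent`, `PwmSchedule` and `buildSchedule` are hypothetical names, and the real generator works on GPIO registers inside an interrupt handler.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not the structures used in the commit.
struct PwmPin {
    uint8_t  pin;         // GPIO number, assumed to be 0..15 here
    uint32_t highCycles;  // how long the pin stays high within one period
};

struct PwmEvent {
    uint32_t atCycles;    // offset from the start of the period
    uint32_t lowMask;     // pins to drive low at that offset
};

struct PwmSchedule {
    uint32_t risingMask = 0;      // pins raised together at time 0
    std::vector<PwmEvent> falls;  // falling edges, sorted by time
};

// Build one period's schedule: a single rising edge for all pins, then
// falling edges ordered by duty cycle. Pins with equal duty share one event.
PwmSchedule buildSchedule(std::vector<PwmPin> pins, uint32_t periodCycles) {
    assert(periodCycles > 0);
    PwmSchedule s;
    std::sort(pins.begin(), pins.end(),
              [](const PwmPin& a, const PwmPin& b) { return a.highCycles < b.highCycles; });
    for (const PwmPin& p : pins) {
        assert(p.pin < 16);
        if (p.highCycles == 0) continue;             // 0% duty: never raised
        s.risingMask |= 1u << p.pin;
        if (p.highCycles >= periodCycles) continue;  // 100% duty: never lowered
        if (!s.falls.empty() && s.falls.back().atCycles == p.highCycles) {
            s.falls.back().lowMask |= 1u << p.pin;   // merge simultaneous edges
        } else {
            s.falls.push_back({p.highCycles, 1u << p.pin});
        }
    }
    return s;
}
```

With N distinct duty cycles this yields one rising event plus at most N falling events per period, which is where the "up to 50%" interrupt reduction quoted above would come from.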
1 parent 5931583 commit ccdde5f
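
The "Add error feedback to waveform generation" bullet describes taking the desired edge time minus the actual one and adding a quarter of that to the next similar phase. A minimal sketch of that arithmetic follows, assuming 32-bit cycle-counter timestamps; `nextPhaseCycles` is a hypothetical helper, and a later bullet notes that the commit ends up feeding back a larger fraction than 1/4.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the commit.
// nominalCycles: requested length of this phase (high or low), in CPU cycles
// desiredEdge:   cycle-counter value at which the last edge should have occurred
// actualEdge:    cycle-counter value at which it actually occurred
// Returns the length to use for the next occurrence of the same phase.
inline uint32_t nextPhaseCycles(uint32_t nominalCycles,
                                uint32_t desiredEdge,
                                uint32_t actualEdge) {
    // Unsigned subtraction followed by a signed cast keeps the difference
    // correct across a 32-bit cycle-counter wraparound.
    int32_t err = static_cast<int32_t>(desiredEdge - actualEdge);  // negative when late
    int32_t adj = err / 4;                                         // damped correction
    int64_t next = static_cast<int64_t>(nominalCycles) + adj;
    return next < 1 ? 1u : static_cast<uint32_t>(next);            // never return 0
}
```

An edge that came 40 cycles late shortens the next 4000-cycle phase to 3990 cycles; one that came 20 cycles early lengthens it to 4005.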

13 files changed: +929 -169 lines changed

boards.txt

+68 -68
Large diffs are not rendered by default.

cores/esp8266/Tone.cpp

+7
@@ -30,6 +30,13 @@ static void _startTone(uint8_t _pin, uint32_t high, uint32_t low, uint32_t durat
     return;
   }
 
+#ifndef WAVEFORM_LOCKED_PHASE
+  // Stop any analogWrites (PWM) because they are a different generator
+  _stopPWM(_pin);
+#endif
+  // If there's another Tone or startWaveform on this pin
+  // it will be changed on-the-fly (no need to stop it)
+
   pinMode(_pin, OUTPUT);
 
   high = std::max(high, (uint32_t)microsecondsToClockCycles(25)); // new 20KHz maximum tone frequency,

cores/esp8266/core_esp8266_waveform.h

+4 -90
@@ -1,93 +1,7 @@
-/*
-  esp8266_waveform - General purpose waveform generation and control,
-  supporting outputs on all pins in parallel.
-
-  Copyright (c) 2018 Earle F. Philhower, III. All rights reserved.
-  Copyright (c) 2020 Dirk O. Kaar.
-
-  The core idea is to have a programmable waveform generator with a unique
-  high and low period (defined in microseconds or CPU clock cycles). TIMER1 is
-  set to 1-shot mode and is always loaded with the time until the next edge
-  of any live waveforms.
-
-  Up to one waveform generator per pin supported.
-
-  Each waveform generator is synchronized to the ESP clock cycle counter, not the
-  timer. This allows for removing interrupt jitter and delay as the counter
-  always increments once per 80MHz clock. Changes to a waveform are
-  contiguous and only take effect on the next waveform transition,
-  allowing for smooth transitions.
-
-  This replaces older tone(), analogWrite(), and the Servo classes.
-
-  Everywhere in the code where "ccy" or "ccys" is used, it means ESP.getCycleCount()
-  clock cycle count, or an interval measured in CPU clock cycles, but not TIMER1
-  cycles (which may be 2 CPU clock cycles @ 160MHz).
-
-  This library is free software; you can redistribute it and/or
-  modify it under the terms of the GNU Lesser General Public
-  License as published by the Free Software Foundation; either
-  version 2.1 of the License, or (at your option) any later version.
-
-  This library is distributed in the hope that it will be useful,
-  but WITHOUT ANY WARRANTY; without even the implied warranty of
-  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-  Lesser General Public License for more details.
-
-  You should have received a copy of the GNU Lesser General Public
-  License along with this library; if not, write to the Free Software
-  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
-*/
+// Wrapper to include both versions of the waveform generator
 
 #ifdef WAVEFORM_LOCKED_PHASE
-
-#include <Arduino.h>
-
-#ifndef __ESP8266_WAVEFORM_H
-#define __ESP8266_WAVEFORM_H
-
-#ifdef __cplusplus
-extern "C" {
+  #include "core_esp8266_waveform_phase.h"
+#else
+  #include "core_esp8266_waveform_pwm.h"
 #endif
-
-// Start or change a waveform of the specified high and low times on specific pin.
-// If runtimeUS > 0 then automatically stop it after that many usecs, relative to the next
-// full period.
-// If waveform is not yet started on pin, and on pin == alignPhase a waveform is running,
-// the new waveform is started at phaseOffsetUS phase offset, in microseconds, to that.
-// Setting autoPwm to true allows the wave generator to maintain PWM duty to idle cycle ratio
-// under load, for applications where frequency or duty cycle must not change, leave false.
-// Returns true or false on success or failure.
-int startWaveform(uint8_t pin, uint32_t timeHighUS, uint32_t timeLowUS,
-  uint32_t runTimeUS = 0, int8_t alignPhase = -1, uint32_t phaseOffsetUS = 0, bool autoPwm = false);
-// Start or change a waveform of the specified high and low CPU clock cycles on specific pin.
-// If runtimeCycles > 0 then automatically stop it after that many CPU clock cycles, relative to the next
-// full period.
-// If waveform is not yet started on pin, and on pin == alignPhase a waveform is running,
-// the new waveform is started at phaseOffsetCcys phase offset, in CPU clock cycles, to that.
-// Setting autoPwm to true allows the wave generator to maintain PWM duty to idle cycle ratio
-// under load, for applications where frequency or duty cycle must not change, leave false.
-// Returns true or false on success or failure.
-int startWaveformClockCycles(uint8_t pin, uint32_t timeHighCcys, uint32_t timeLowCcys,
-  uint32_t runTimeCcys = 0, int8_t alignPhase = -1, uint32_t phaseOffsetCcys = 0, bool autoPwm = false);
-// Stop a waveform, if any, on the specified pin.
-// Returns true or false on success or failure.
-int stopWaveform(uint8_t pin);
-
-// Add a callback function to be called on *EVERY* timer1 trigger. The
-// callback returns the number of microseconds until the next desired call.
-// However, since it is called every timer1 interrupt, it may be called
-// again before this period. It should therefore use the ESP Cycle Counter
-// to determine whether or not to perform an operation.
-// Pass in NULL to disable the callback and, if no other waveforms being
-// generated, stop the timer as well.
-// Make sure the CB function has the ICACHE_RAM_ATTR decorator.
-void setTimer1Callback(uint32_t (*fn)());
-
-#ifdef __cplusplus
-}
-#endif
-
-#endif // __ESP8266_WAVEFORM_H
-
-#endif // WAVEFORM_LOCKED_PHASE

cores/esp8266/core_esp8266_waveform.cpp renamed to cores/esp8266/core_esp8266_waveform_phase.cpp

+1 -1
@@ -41,7 +41,7 @@
 
 #ifdef WAVEFORM_LOCKED_PHASE
 
-#include "core_esp8266_waveform.h"
+#include "core_esp8266_waveform_phase.h"
 #include <Arduino.h>
 #include "ets_sys.h"
 #include <atomic>

cores/esp8266/core_esp8266_waveform_phase.h

+93
@@ -0,0 +1,93 @@
+/*
+  esp8266_waveform - General purpose waveform generation and control,
+  supporting outputs on all pins in parallel.
+
+  Copyright (c) 2018 Earle F. Philhower, III. All rights reserved.
+  Copyright (c) 2020 Dirk O. Kaar.
+
+  The core idea is to have a programmable waveform generator with a unique
+  high and low period (defined in microseconds or CPU clock cycles). TIMER1 is
+  set to 1-shot mode and is always loaded with the time until the next edge
+  of any live waveforms.
+
+  Up to one waveform generator per pin supported.
+
+  Each waveform generator is synchronized to the ESP clock cycle counter, not the
+  timer. This allows for removing interrupt jitter and delay as the counter
+  always increments once per 80MHz clock. Changes to a waveform are
+  contiguous and only take effect on the next waveform transition,
+  allowing for smooth transitions.
+
+  This replaces older tone(), analogWrite(), and the Servo classes.
+
+  Everywhere in the code where "ccy" or "ccys" is used, it means ESP.getCycleCount()
+  clock cycle count, or an interval measured in CPU clock cycles, but not TIMER1
+  cycles (which may be 2 CPU clock cycles @ 160MHz).
+
+  This library is free software; you can redistribute it and/or
+  modify it under the terms of the GNU Lesser General Public
+  License as published by the Free Software Foundation; either
+  version 2.1 of the License, or (at your option) any later version.
+
+  This library is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+  Lesser General Public License for more details.
+
+  You should have received a copy of the GNU Lesser General Public
+  License along with this library; if not, write to the Free Software
+  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+*/
+
+#ifdef WAVEFORM_LOCKED_PHASE
+
+#include <Arduino.h>
+
+#ifndef __ESP8266_WAVEFORM_H
+#define __ESP8266_WAVEFORM_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+// Start or change a waveform of the specified high and low times on specific pin.
+// If runtimeUS > 0 then automatically stop it after that many usecs, relative to the next
+// full period.
+// If waveform is not yet started on pin, and on pin == alignPhase a waveform is running,
+// the new waveform is started at phaseOffsetUS phase offset, in microseconds, to that.
+// Setting autoPwm to true allows the wave generator to maintain PWM duty to idle cycle ratio
+// under load, for applications where frequency or duty cycle must not change, leave false.
+// Returns true or false on success or failure.
+int startWaveform(uint8_t pin, uint32_t timeHighUS, uint32_t timeLowUS,
+  uint32_t runTimeUS = 0, int8_t alignPhase = -1, uint32_t phaseOffsetUS = 0, bool autoPwm = false);
+// Start or change a waveform of the specified high and low CPU clock cycles on specific pin.
+// If runtimeCycles > 0 then automatically stop it after that many CPU clock cycles, relative to the next
+// full period.
+// If waveform is not yet started on pin, and on pin == alignPhase a waveform is running,
+// the new waveform is started at phaseOffsetCcys phase offset, in CPU clock cycles, to that.
+// Setting autoPwm to true allows the wave generator to maintain PWM duty to idle cycle ratio
+// under load, for applications where frequency or duty cycle must not change, leave false.
+// Returns true or false on success or failure.
+int startWaveformClockCycles(uint8_t pin, uint32_t timeHighCcys, uint32_t timeLowCcys,
+  uint32_t runTimeCcys = 0, int8_t alignPhase = -1, uint32_t phaseOffsetCcys = 0, bool autoPwm = false);
+// Stop a waveform, if any, on the specified pin.
+// Returns true or false on success or failure.
+int stopWaveform(uint8_t pin);
+
+// Add a callback function to be called on *EVERY* timer1 trigger. The
+// callback returns the number of microseconds until the next desired call.
+// However, since it is called every timer1 interrupt, it may be called
+// again before this period. It should therefore use the ESP Cycle Counter
+// to determine whether or not to perform an operation.
+// Pass in NULL to disable the callback and, if no other waveforms being
+// generated, stop the timer as well.
+// Make sure the CB function has the ICACHE_RAM_ATTR decorator.
+void setTimer1Callback(uint32_t (*fn)());
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif // __ESP8266_WAVEFORM_H
+
+#endif // WAVEFORM_LOCKED_PHASE
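
The header comment for `setTimer1Callback` says the callback may fire before its requested delay has elapsed and should consult the cycle counter before doing any work. One way to do that could look like the sketch below. `PeriodicGate` is a hypothetical helper written so it compiles on a host; the current cycle count is passed in as `now` rather than read from the hardware.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the ESP8266 core.
struct PeriodicGate {
    uint32_t periodCycles;  // desired spacing between operations, in CPU cycles
    uint32_t nextDue;       // cycle-counter value at which the next operation is due

    // Returns true (and schedules the next deadline) once 'now' has reached
    // nextDue. The signed cast keeps the comparison valid across wraparound.
    bool due(uint32_t now) {
        if (static_cast<int32_t>(now - nextDue) < 0) {
            return false;   // called early: nothing to do yet
        }
        nextDue += periodCycles;
        return true;
    }

    // Microseconds until the next deadline, the value a timer1 callback is
    // expected to return. cpuMHz is CPU cycles per microsecond (80 or 160).
    uint32_t usUntilDue(uint32_t now, uint32_t cpuMHz) const {
        int32_t remaining = static_cast<int32_t>(nextDue - now);
        return remaining <= 0 ? 0u : static_cast<uint32_t>(remaining) / cpuMHz;
    }
};
```

On the device, `now` would presumably come from `ESP.getCycleCount()`, and the function handed to `setTimer1Callback` would call `due()`, do its work only when that returns true, and return `usUntilDue()`. Per the header comment, that function needs the `ICACHE_RAM_ATTR` decorator.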
