Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move to PROGMEM aware libc, allow PSTR in printf() #4160

Closed
wants to merge 21 commits into from

Conversation

earlephilhower
Copy link
Collaborator

@earlephilhower earlephilhower commented Jan 13, 2018

A Newlib (libc) patch is in progress to move the _P functions from inside
Arduino into first-class citizens in libc. This Arduino patch cleans up
code that's been migrated there. Binaries for the new libs are included
because it seems they're part of the Arduino git tree, and should be
replaced with @igrr built ones when/if the Newlib changes are accepted.
igrr/newlib-xtensa#5

Notable changes/additions for Arduino:
Allow for use of PROGMEM based format and parameter strings in all
*printf functions. No need for copying PSTR()s into RAM before printing
them out (transparently saves heap space when using _P functions) and
makes it easier to print out constant strings for applications.

Add "%S" (capital-S) format that I've been told, but cannot verify,
is used in Arduino to specify a PROGMEM string parameter in printfs,
as an alias for "%s" since plain "%s" can now handle PROGMEM.

Optimized the memcpy_P and strncpy_P functions to use 32-bit direct reads
whenver possible (source and dest alignment mediated), but there is
still room for improvement in others like *_P.

Finally, move several constant arrays from RODATA into PROGMEM and
update their accessors. Among these are the ctype array, ~260 bytes,
mprec* arrays, ~300 bytes, and strings/daycounts in the time
formatting functions, ~200 bytes. All told, sketches will see from
300 to 800 additional RAM heap free on startup (depending on their
use of these routines).

@earlephilhower
Copy link
Collaborator Author

The builds succeeded, but the host_tests failed due to standard x86 libc not having the new sys/pgmspace.h include file that was added to newlib. The build job would need to have the default, no-op newlib/libc/include/sys/pgmspace.h defined in the newlib patch submitted to newlib-xtensa.

@devyte
Copy link
Collaborator

devyte commented Jan 15, 2018

This sounds really cool, but moving things to libc feels weird to me, mostly because it'd mean we'd have a non-standard libc. I just did a quick search, and I realize that in the AVR world this is apparently nothing new, but it still feels weird.
Therefore, I can only give an opinion on style and coding, and not on purpose.
I did a review on the libc repo, linked above.

@earlephilhower
Copy link
Collaborator Author

Latest commit includes some comments from newlib repo, as well as accelerated strnlen_P code/

A Newlib (libc) patch is in progress to move the _P functions from inside
Arduino into first-class citizens in libc.  This Arduino patch cleans up
code that's been migrated there.  Binaries for the new libs are included
because it seems they're part of the Arduino git tree, and should be
replaced with @igrr built ones when/if the Newlib changes are accepted.

Notable changes/additions for Arduino:
Allow for use of PROGMEM based format and parameter strings in all
*printf functions.  No need for copying PSTR()s into RAM before printing
them out (transparently saves heap space when using _P functions) and
makes it easier to print out constant strings for applications.

Add "%S" (capital-S) format that I've been told, but cannot verify,
is used in Arduino to specify a PROGMEM string parameter in printfs,
as an alias for "%s" since plain "%s" can now handle PROGMEM.

Optimized the memcpy_P, strnlen_P, and strncpy_P functions to use 32-bit
direct reads whenver possible (source and dest alignment mediated), but
there is still room for improvement in others.

Finally, move several constant arrays from RODATA into PROGMEM and
update their accessors.  Among these are the ctype array, ~260 bytes,
mprec* arrays, ~300 bytes, and strings/daycounts in the time
formatting functions, ~200 bytes.  All told, sketches will see from
300 to 800 additional RAM heap free on startup (depending on their
use of these routines).
Rebuild the binaries using a git clone of
https://github.com/igrr/newlib-xtensa

Build commands for posterity:
````
rm -rf ./xtensa-lx106-elf/
./configure --prefix=<DIR>/esp8266/tools/sdk/libc --with-newlib \
            --enable-multilib --disable-newlib-io-c99-formats \
            --disable-newlib-supplied-syscalls \
            --enable-newlib-nano-formatted-io --enable-newlib-reent-small \
            --enable-target-optspace \
            --program-transform-name="s&^&xtensa-lx106-elf-&" \
            --disable-option-checking --with-target-subdir=xtensa-lx106-elf \
            --target=xtensa-lx106-elf
rm -f etc/config.cache
CROSS_CFLAGS="-fno-omit-frame-pointer -DSIGNAL_PROVIDED -DABORT_PROVIDED"\
             " -DMALLOC_PROVIDED" \
  PATH=<DIR>/esp8266/tools/xtensa-lx106-elf/bin/:$PATH \
  make all install
````
@earlephilhower earlephilhower added this to the 2.5.0 milestone Oct 27, 2018
@earlephilhower earlephilhower requested a review from d-a-v October 27, 2018 23:21
Include fix from newlib-xtensa/fix-strlen branch cleaning up misaligned
access on a non-aligned source string.
Ran the included test suite on ESP8266 tstring.c with the following defines:
 #define MAX_1 50
 #define memcmp memcmp_P
 #define memcpy memcpy_P
 #define memmem memmem_P
 #define memchr memchr_P
 #define strcat strcat_P
 #define strncat strncat_P
 #define strcpy strcpy_P
 #define strlen strlen_P
 #define strnlen strnlen_P
 #define strcmp strcmp_P
 #define strncmp strncmp_P

Uncovered edge case and return value problems in the optimized versions of
the strnlen_P and strncpy_P functions.  Corrected.
memcpy-1.c test suite showed error in return value of memcpy_P.  Correct it.
Random crashes, often on String constructors using a PSTR, would occur due
to the accelerated strnlen_P going past the end of the string. Would make
debug builds fail, too (ESP.getVersionString() failure).

Fix to fall through to normal copy on a word that's got a 0 byte anywhere
in it.
Copy link
Collaborator

@d-a-v d-a-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We found out that this feature is already enabled in the new SDK-v3-dev (our current one).
But this PR does the job much faster.
With @earlephilhower perf test (to be added in tests/?), sprintf is 5x faster and memcpy 10x faster than current SDK's versions when data source is in PROGMEM.

Add test suite used to debug libc optimized _P functions to the device
tests.
Rebuild .a from igrr's repo at 347260af117b4177389e69fd4d04169b11d87a97
@devyte
Copy link
Collaborator

devyte commented Nov 12, 2018

I didn't go in-depth into the code bits. I assume by the extensive device tests that functionality is sufficiently well covered.
Just a minor nitpick and a question, other than that, LGTM!

Copy link
Collaborator

@devyte devyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving after explanation

earlephilhower added a commit to earlephilhower/Arduino that referenced this pull request Nov 28, 2018
Bring newlib changes into the tree from original PR esp8266#4160
@earlephilhower
Copy link
Collaborator Author

Closing this as it is now part of the toolchain update in PR#4160 .

@earlephilhower
Copy link
Collaborator Author

Err, I mean it's in subsumed into PR #5376

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants