【Introduction】
Software-side evaluation of MCU performance almost always comes down to a CoreMark benchmark run.
【Software environment】
CoreMark scores depend heavily on the build environment. The environment used for this evaluation was:
1. Compiler: arm-none-eabi-gcc version 14.2.1 20241119 (Arm GNU Toolchain 14.2.Rel1 (Build arm-14.52))
2. Editor: VS Code
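【Porting CoreMark】
1. Download the source from https://github.com/eembc/coremark
2. Unpack it, then copy core_list_join.c, core_main.c, core_matrix.c, core_state.c, core_util.c and coremark.h from the top-level directory, plus core_portme.c and core_portme.h from the simple directory, into the project's coremark directory.
3. In mm32f5370_it.c, add a uint32_t gTick variable. In SysTick_Handler, increment gTick and comment out the original delay check.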
/***********************************************************************************************************************
  * @brief  This function handles System tick timer
  * @note   none
  * @param  none
  * @retval none
  *********************************************************************************************************************/
void SysTick_Handler(void)
{
    // if (0 != PLATFORM_DelayTick)
    // {
    //     PLATFORM_DelayTick--;
    // }
    gTick++;
}
4. In core_portme.c, include platform.h and hal_conf.h, and add an extern declaration for the global gTick:
#include "hal_conf.h"
#include "platform.h"

extern uint32_t gTick;
5. Set EE_TICKS_PER_SEC to 1000, so that one tick is one millisecond:
#define EE_TICKS_PER_SEC 1000 // (NSECS_PER_SEC / TIMER_RES_DIVIDER)
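With a 1 ms tick, the reported time and score follow from a simple division. The sketch below shows that arithmetic on the host. The helper names ticks_to_secs and iterations_per_sec are hypothetical and are not part of the CoreMark sources; only EE_TICKS_PER_SEC comes from the port.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as in the port: one tick per millisecond. */
#define EE_TICKS_PER_SEC 1000u

/* Hypothetical helper: convert a tick count into elapsed seconds. */
static double ticks_to_secs(uint32_t ticks)
{
    return (double)ticks / (double)EE_TICKS_PER_SEC;
}

/* Hypothetical helper: iterations completed per second of run time. */
static double iterations_per_sec(uint32_t iterations, uint32_t ticks)
{
    assert(ticks > 0u); /* a zero-length run has no meaningful rate */
    return (double)iterations / ticks_to_secs(ticks);
}
```

For example, 6000 iterations completed in 13782 ticks correspond to 13.782 s and roughly 435.35 iterations per second.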
6. Modify the three time functions:
void
start_time(void)
{
    gTick = 0;
    SysTick_Config(((SystemCoreClock / 1000) * 1)); // 1 ms tick
    // GETMYTIME(&start_time_val);
}

void
stop_time(void)
{
    // GETMYTIME(&stop_time_val);
    SysTick->CTRL &= ~SysTick_CTRL_ENABLE_Msk; // clear ENABLE so gTick stops advancing
}

CORE_TICKS
get_time(void)
{
    // CORE_TICKS elapsed
    //     = (CORE_TICKS)(MYTIMEDIFF(stop_time_val, start_time_val));
    // return elapsed;
    return (CORE_TICKS)gTick;
}
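One detail worth checking in stop_time is the mask applied to the control register. AND-ing a register with 0xFFFFFFFF leaves every bit unchanged, whereas stopping the counter means clearing its enable bit. The host-side sketch below illustrates the difference on a plain variable. The bit positions follow the Cortex-M SysTick CTRL layout, while the macro and function names are hypothetical stand-ins rather than CMSIS identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* SysTick CTRL bit positions (ENABLE = bit 0, TICKINT = bit 1, CLKSOURCE = bit 2). */
#define CTRL_ENABLE    (1u << 0)
#define CTRL_TICKINT   (1u << 1)
#define CTRL_CLKSOURCE (1u << 2)

/* Hypothetical stand-in for "SysTick->CTRL &= ~ENABLE" on a plain value. */
static uint32_t stop_counter(uint32_t ctrl)
{
    return ctrl & ~CTRL_ENABLE;
}

/* Reports whether the counter would still be running for this CTRL value. */
static int is_running(uint32_t ctrl)
{
    return (ctrl & CTRL_ENABLE) != 0u;
}
```

Starting from a CTRL value of 0x7 (all three bits set), masking with 0xFFFFFFFF still leaves 0x7, while stop_counter returns 0x6 and the tick interrupt no longer fires.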
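7. In core_portme.h, add an ITERATIONS macro to set how long the benchmark runs. If the value is too small, the run lasts less than 10 seconds and CoreMark reports an error, so adjust it until the run time qualifies:
#define ITERATIONS 6000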
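Rather than tuning ITERATIONS by trial and error, a lower bound can be estimated from any completed run. The sketch below is one way to do that, assuming the iteration rate stays constant. The function name min_iterations and its parameters are hypothetical, not part of CoreMark.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest iteration count expected to run for at
 * least min_secs, given a trial run of `iterations` that took `ticks`.
 * Computes ceil(iterations * ticks_per_sec * min_secs / ticks).
 */
static uint32_t min_iterations(uint32_t iterations, uint32_t ticks,
                               uint32_t ticks_per_sec, uint32_t min_secs)
{
    assert(ticks > 0u); /* the trial run must have measurable duration */
    uint64_t num = (uint64_t)iterations * ticks_per_sec * min_secs;
    return (uint32_t)((num + ticks - 1u) / ticks); /* round up */
}
```

With a run of 6000 iterations in 13782 ticks at 1000 ticks per second, the estimate for a 10-second run is 4354 iterations, so 6000 leaves a comfortable margin.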
8. Update the flags string that is printed in the report:
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \
    "-Ofast" /* "Please put compiler flags here (e.g. -o3)" */
#endif
Because core_main.c defines its own main function, comment out main in main.c and call the system initialization routine PLATFORM_Init from portable_init instead:
void
portable_init(core_portable *p, int *argc, char *argv[])
{
    PLATFORM_Init();

    (void)argc; // prevent unused warning
    (void)argv; // prevent unused warning

    if (sizeof(ee_ptr_int) != sizeof(ee_u8 *))
    {
        ee_printf(
            "ERROR! Please define ee_ptr_int to a type that holds a "
            "pointer!\n");
    }
    if (sizeof(ee_u32) != 4)
    {
        ee_printf("ERROR! Please define ee_u32 to a 32b unsigned type!\n");
    }
    p->portable_id = 1;
}
【Compiler options】
In the makefile, set the optimization level to -Ofast and configure the CPU, FPU, and float ABI:
# Debug information
DEBUG = 0
# Optimization level
OPT = -Ofast
# Link-time optimization
LTO = -flto

#######################################
# Target MCU configuration
#######################################
# cpu
CPU = -mcpu=cortex-m33
# fpu
FPU = -mfpu=fpv4-sp-d16 #none
# float-abi
FLOAT-ABI = -mfloat-abi=hard #none
【Test results】
BOARD : EVB-F5375
MCU : MM32F5375G8PV
PLL (clocked by HSE) used as system clock source
SYSCLK Frequency : 180.000 MHz
HCLK Frequency : 180.000 MHz
PCLK1 Frequency : 180.000 MHz
PCLK2 Frequency : 180.000 MHz
2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 13782
Total time (secs): 13.782000
Iterations/Sec   : 435.350457
Iterations       : 6000
Compiler version : GCC14.2.1 20241119
Compiler flags   : -Ofast
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0xa14c
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 435.350457 / GCC14.2.1 20241119 -Ofast / STACK
The score is 435.
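【Analysis】
According to the user manual, the part should deliver 4.05 CoreMark/MHz, which at 180 MHz works out to 4.05 × 180 = 729, so a score in the 720 to 730 range. My result is only a little over half of that. I also tried the -Os, -Og and -Oz optimization levels, but every one of them scored lower than 435. I suspect my makefile is not set up correctly, so here is the full makefile: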
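The gap is easier to judge when both figures are expressed in CoreMark/MHz. The sketch below does the conversion in both directions. The function names are hypothetical, and the 4.05 CoreMark/MHz figure is simply the user-manual value quoted above.

```c
#include <assert.h>

/* Hypothetical helper: score expected from a CoreMark/MHz rating. */
static double expected_score(double coremark_per_mhz, double clock_mhz)
{
    return coremark_per_mhz * clock_mhz;
}

/* Hypothetical helper: CoreMark/MHz achieved by a measured score. */
static double achieved_per_mhz(double score, double clock_mhz)
{
    assert(clock_mhz > 0.0); /* clock frequency must be positive */
    return score / clock_mhz;
}
```

At 180 MHz the rated 4.05 CoreMark/MHz corresponds to a score of 729, while the measured 435.35 corresponds to about 2.42 CoreMark/MHz, roughly 60% of the rated figure.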
# Build target name
TARGET = template
# Debug information
DEBUG = 0
# Optimization level
OPT = -Ofast
# Link-time optimization
LTO = -flto

# Intermediate build directories
BUILD_DIR = build
EXEC_DIR = build_exec

# Module imports
Core_DIR = Core
include Core/Core.mk
Device_DIR = Device
include Device/Device.mk
# LVGL_DIR = LVGL
# include LVGL/lvgl.mk

# C preprocessor definitions
C_DEFS += -DUSE_STDPERIPH_DRIVER -DCUSTOM_HSE_VAL
# C include directories
C_INCLUDES +=
# C source files
C_SOURCES +=
# Libraries to link
LIBS += -lc -lm -lnosys
# Library search paths
LIBDIR +=

#######################################
# Toolchain selection
#######################################
PREFIX = arm-none-eabi-
# Enable the next line to point at a specific GCC directory
#GCC_PATH = /Applications/ARM/bin/
ifdef GCC_PATH
CC = $(GCC_PATH)/$(PREFIX)gcc
AS = $(GCC_PATH)/$(PREFIX)gcc -x assembler-with-cpp
CP = $(GCC_PATH)/$(PREFIX)objcopy
DUMP = $(GCC_PATH)/$(PREFIX)objdump
SZ = $(GCC_PATH)/$(PREFIX)size
else
CC = $(PREFIX)gcc
AS = $(PREFIX)gcc -x assembler-with-cpp
CP = $(PREFIX)objcopy
DUMP = $(PREFIX)objdump
SZ = $(PREFIX)size
endif
HEX = $(CP) -O ihex
BIN = $(CP) -O binary -S

#######################################
# Target MCU configuration
#######################################
# cpu
CPU = -mcpu=cortex-m33
# fpu
FPU = -mfpu=fpv4-sp-d16 #none
# float-abi
FLOAT-ABI = -mfloat-abi=hard #none
# mcu
MCU = $(CPU) -mthumb $(FPU) $(FLOAT-ABI)

# compile gcc flags
ASFLAGS = $(MCU) $(AS_DEFS) $(AS_INCLUDES) $(OPT) -Wall -fdata-sections -ffunction-sections
CFLAGS += $(MCU) $(C_DEFS) $(C_INCLUDES) $(OPT) -Wall -fdata-sections -ffunction-sections
ifeq ($(DEBUG), 1)
CFLAGS += -g -gdwarf-2
endif

# Generate dependency information
CFLAGS += -MMD -MP -MF"$(@:%.o=%.d)"

NO_COLOR = \033[00m
OK_COLOR = \033[32m
ERR_COLOR = \033[31m

#######################################
# LDFLAGS
#######################################
# libraries
LDFLAGS = $(MCU) -T$(LDSCRIPT) $(LIBDIR) $(LIBS) -Wl,-Map=$(BUILD_DIR)/$(TARGET).map,--cref \
	-Wl,--gc-sections -ffunction-sections --specs=nano.specs --specs=nosys.specs $(LTO)
# Enable floating-point printf
LDFLAGS += -lc -lrdimon -u _printf_float

# default action: build all
all: $(EXEC_DIR)/$(TARGET).elf $(EXEC_DIR)/$(TARGET).hex $(EXEC_DIR)/$(TARGET).bin POST_BUILD

#######################################
# build the application
#######################################
# list of objects
OBJECTS += $(addprefix $(BUILD_DIR)/,$(notdir $(C_SOURCES:.c=.o)))
vpath %.c $(sort $(dir $(C_SOURCES)))
# list of ASM program objects
OBJECTS += $(addprefix $(BUILD_DIR)/,$(notdir $(ASM_SOURCES:.S=.o)))
vpath %.S $(sort $(dir $(ASM_SOURCES)))

$(BUILD_DIR)/%.o: %.c Makefile | $(BUILD_DIR)
	@echo "[CC] $<"
	@$(CC) -c $(CFLAGS) -Wa,-a,-ad,-alms=$(BUILD_DIR)/$(notdir $(<:.c=.lst)) $< -o $@

$(BUILD_DIR)/%.o: %.S Makefile | $(BUILD_DIR)
	@echo "[AS] $<"
	@$(AS) -c $(CFLAGS) $< -o $@

$(EXEC_DIR)/$(TARGET).elf: $(OBJECTS) Makefile | $(EXEC_DIR)
	@echo "[LD] $@"
	@$(CC) $(OBJECTS) $(LDFLAGS) -o $@

$(EXEC_DIR)/%.hex: $(EXEC_DIR)/%.elf | $(EXEC_DIR)
	@echo "[HEX] $< -> $@"
	@$(HEX) $< $@

$(EXEC_DIR)/%.bin: $(EXEC_DIR)/%.elf | $(EXEC_DIR)
	@echo "[BIN] $< -> $@"
	@$(BIN) $< $@

$(BUILD_DIR):
	@mkdir $@

$(EXEC_DIR):
	@mkdir $@

.PHONY: POST_BUILD
POST_BUILD: $(EXEC_DIR)/$(TARGET).elf
ifeq ($(DEBUG), 1)
	@echo "[DUMP] $< -> $(EXEC_DIR)/$(TARGET).S"
	@$(DUMP) -d $< > $(EXEC_DIR)/$(TARGET).S
endif
	@echo "[SIZE] $<"
	@$(SZ) $<
	@echo -e "$(OK_COLOR)Build Finish$(NO_COLOR)"

#######################################
# Remove intermediate files
#######################################
.PHONY: clean
clean:
	@rm -rf $(BUILD_DIR)
	@echo -e "$(OK_COLOR)Clean Build Finish$(NO_COLOR)"

.PHONY: cleanall
cleanall: clean
	@rm -rf $(EXEC_DIR)
	@echo -e "$(OK_COLOR)Clean Exec Finish$(NO_COLOR)"

#######################################
# Flash the program
#######################################
.PHONY: flash
flash: $(EXEC_DIR)/$(TARGET).elf
	@echo -e "$(OK_COLOR)Start pyOCD$(NO_COLOR)"
	@pyocd flash $<

#######################################
# Build and flash the program
#######################################
.PHONY: run
run:
	@make -j12
	@make flash

#######################################
# Dependency files
#######################################
-include $(wildcard $(BUILD_DIR)/*.d)

# *** EOF ***
The project source is attached.
To compare compilers, I also ran the benchmark under Keil 5, with the following result:
[10:30:28.134] RX: 2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 14430
Total time (secs): 14.430000
Iterations/Sec   : 415.800416
Iterations       : 6000
Compiler version : GCCClang 16.0.0
Compiler flags   : -Ofast
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0xa14c
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 415.800416 / GCCClang 16.0.0 -Ofast / STACK
That run also scored only about 415.
I then revised the board's ROM and RAM settings in the .sct file:
#define __ROM_BASE      0x008000000
#define __ROM_SIZE      0x000080000

/*--------------------- Embedded RAM Configuration ---------------------------
; <h> RAM Configuration
; <o0> RAM Base Address    <0x0-0xFFFFFFFF:8>
; <o1> RAM Size (in Bytes) <0x0-0xFFFFFFFF:8>
; </h>
 *----------------------------------------------------------------------------*/
#define __RAM_BASE      0x20000000
#define __RAM_SIZE      0x00020000

/*--------------------- Stack / Heap Configuration ---------------------------
; <h> Stack / Heap Configuration
; <o0> Stack Size (in Bytes) <0x0-0xFFFFFFFF:8>
; <o1> Heap Size (in Bytes)  <0x0-0xFFFFFFFF:8>
; </h>
 *----------------------------------------------------------------------------*/
#define __STACK_SIZE    0x00002000
#define __HEAP_SIZE     0x00002000
After rebuilding, the score rose to 494:
[10:42:56.389] RX:
BOARD : EVB-F5375
MCU : MM32F5375G8PV
PLL (clocked by HSE) used as system clock source
SYSCLK Frequency : 180.000 MHz
HCLK Frequency : 180.000 MHz
PCLK1 Frequency : 180.000 MHz
PCLK2 Frequency : 180.000 MHz
[10:43:08.542] RX: 2K performance run parameters for coremark.
CoreMark Size    : 666
Total ticks      : 12132
Total time (secs): 12.132000
Iterations/Sec   : 494.559842
Iterations       : 6000
Compiler version : GCCClang 16.0.0
Compiler flags   : -Ofast
Memory location  : STACK
seedcrc          : 0xe9f5
[0]crclist       : 0xe714
[0]crcmatrix     : 0x1fd7
[0]crcstate      : 0x8e3a
[0]crcfinal      : 0xa14c
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 494.559842 / GCCClang 16.0.0 -Ofast / STACK