+2023-08-21 Paulo Andrade <pcpa@gnu.org>
+
+ * check/Makefile.am, check/lightning.c: Add new hmul tests.
+ * doc/body.texi: Document hmul.
+ * include/lightning.h.in: Create the new hmul codes.
+ * lib/jit_aarch64-cpu.c, lib/jit_aarch64-sz.c, lib/jit_aarch64.c,
+ lib/jit_alpha-cpu.c, lib/jit_alpha-sz.c, lib/jit_alpha.c,
+ lib/jit_arm-cpu.c, lib/jit_arm-sz.c, lib/jit_arm.c,
+ lib/jit_hppa-cpu.c, lib/jit_hppa-sz.c, lib/jit_hppa.c,
+ lib/jit_ia64-cpu.c, lib/jit_ia64-sz.c, lib/jit_ia64.c,
+ lib/jit_loongarch-cpu.c, lib/jit_loongarch-sz.c, lib/jit_loongarch.c,
+ lib/jit_mips-cpu.c, lib/jit_mips-sz.c, lib/jit_mips.c,
+ lib/jit_ppc-cpu.c, lib/jit_ppc-sz.c, lib/jit_ppc.c,
+ lib/jit_riscv-cpu.c, lib/jit_riscv-sz.c, lib/jit_riscv.c,
+ lib/jit_s390-cpu.c, lib/jit_s390-sz.c, lib/jit_s390.c,
+ lib/jit_sparc-cpu.c, lib/jit_sparc-sz.c, lib/jit_sparc.c,
+ lib/jit_x86-cpu.c, lib/jit_x86-sz.c, lib/jit_x86.c: Implement
+ hmul and update the *-sz.c files.
+ * lib/jit_names.c, lib/lightning.c: Add knowledge of hmul.
+
+2023-04-18 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning.h.in: Define new fmar_f, fmai_f, fmsr_f,
+ fmsi_f, fmar_d, fmai_d, fmsr_d and fmsi_d instructions, that
+ add support for fused multiply add/sub, in the format
+ r0 = r1 * r2 +/- r3.
+ * include/lightning/jit_private.h: Add helper macros for debug
+ output.
+ * lib/jit_names.c: Add strings for debug output.
+ * lib/jit_print.c: Print debug output for the new instructions.
+ * lib/lightning.c: Add logic for the new register pair in the
+ 'v' (second) field of jit_node_t. The new pattern is required
+ to allow having a 'double' immediate in the last argument, for
+ the versions with immediates. The versions with immediates are
+ added for consistency, as they should be very rarely used in
+ common usage of fused multiply add/sub.
+
+2023-04-06 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning.h.in: Define new movi_w_f, movi_w_d and
+ movi_ww_d instructions, to have an inverse counterpart to
+ movi_f_w, movi_d_w and movi_d_ww.
+ * lib/lightning.c: Update for the new instructions.
+ * lib/jit_names.c, lib/jit_print.c: Update debug information.
+
+2023-04-05 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning.h.in: Define new unldr, unldi, unldr_x,
+ unldi_x, unstr, unsti, unstr_x and unsti_x instructions.
+ Remove comment about internal backend specific codes, as they
+ are required by unldr_x and unstr_x in some code paths.
+ * lib/lightning.c: Implement generic movi_f_w, movi_d_w and
+ movi_d_ww that are actually the same for all ports.
+ Define generic load/store implementations when unaligned memory
+ access does not trap.
+
+2023-03-23 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning.h.in: Define new qlshr, qlshi, qlshr_u,
+ qlshi_u, qrshr, qrshi, qrshr_u and qrshi_u instructions.
+ * lib/jit_fallback.c: Implement new fallbacks.
+ * lib/jit_names.c: Update debug information.
+ * lib/lightning.c: Add code to update regsets, matching other
+ instructions with two outputs.
+
+2023-03-20 Paulo Andrade <pcpa@gnu.org>
+
+ * check/all.tst: Add missing instructions to debug encoding.
+ * check/lightning.c: Implement calls to the new rich set of
+ instructions with an immediate argument, most of which are resolved
+ at code generation time. The exceptions are jit_negi_{f,d},
+ jit_absi_{f,d} and jit_sqrti_{f,d}, which generate code to execute at
+ runtime. This is required because the code generator should create
+ the proper float environment with rounding modes, exceptions, etc.
+ The new jit_depi is just a wrapper to have the second operand as an
+ immediate and call jit_depr.
+ * include/lightning.h.in: Declare new instructions code and function
+ prototypes as appropriate.
+ * include/lightning/jit_private.h: Add 4 new macros to generate
+ synthetic debug for float operations with an immediate argument.
+ * lib/jit_aarch64.c, lib/jit_alpha.c, lib/jit_arm.c, lib/jit_hppa.c,
+ lib/jit_ia64.c, lib/jit_loongarch.c, lib/jit_mips.c, lib/jit_ppc.c,
+ lib/jit_riscv.c, lib/jit_s390.c, lib/jit_sparc.c, lib/jit_x86.c:
+ Add code to call the generic code to implement new instructions
+ with immediate operands.
+ * lib/jit_names.c, lib/jit_print.c: Add debug for the new instructions
+ with immediate operands.
+ * lib/lightning.c: Add code to handle regsets and actually implement
+ the generic new instructions.
+
+2023-03-17 Paulo Andrade <pcpa@gnu.org>
+
+ * lib/jit_fallback.c: Implement fallbacks for new instructions
+ ext, ext_u and dep.
+ * lib/lightning.c: Add code to understand the new instructions
+ and update regsets as appropriate.
+ * lib/jit_names.c, lib/jit_print.c: Update for debug information
+ of ext, ext_u and dep.
+ * include/lightning.h.in: Define jit_code_t for ext, ext_u and dep.
+ * check/lightning.c: Handle the new instructions.
+ * check/all.tst: Add new instructions for the full disassembly.
+
+2023-03-07 Paulo Andrade <pcpa@gnu.org>
+
+ * check/alu_rot.tst, check/alu_rot.ok: New test files for the new
+ lrotr, lroti, rrotr and rroti instructions.
+ * check/Makefile.am, check/lightning.c, include/lightning.h.in,
+ lib/jit_names.c, lib/lightning.c, doc/body.texi: Update for the
+ new instructions.
+ * lib/jit_aarch64-cpu.c, lib/jit_aarch64.c, lib/jit_arm-cpu.c,
+ lib/jit_arm.c: Implement optimized rrotr and rroti. lrotr and
+ lroti just adjust parameters for a left shift rotate.
+ * lib/jit_alpha-cpu.c, lib/jit_alpha.c, lib/jit_ia64-cpu.c,
+ lib/jit_ia64.c, lib/jit_riscv-cpu.c, lib/jit_riscv.c,
+ lib/jit_sparc-cpu.c, lib/jit_sparc.c: Implement calls to fallback
+ lrotr, lroti, rrotr and rroti.
+ * lib/jit_hppa-cpu.c, lib/jit_hppa.c: Implement optimized rroti.
+ Other instructions use fallbacks.
+ * lib/jit_loongarch-cpu.c, lib/jit_loongarch.c: Implement optimized
+ rrotr and rroti. lrotr and lroti just adapt arguments and use a
+ right rotate.
+ * lib/jit_mips-cpu.c, lib/jit_mips.c: If mips2, implement optimized
+ rrotr and rroti. lrotr and lroti just adapt arguments and use a
+ right rotate. If mips1, use fallbacks.
+ * lib/jit_ppc-cpu.c, lib/jit_ppc.c, lib/jit_s390-cpu.c, lib/jit_s390.c,
+ lib/jit_x86-cpu.c, lib/jit_x86.c: Implement optimized lrotr,
+ lroti, rrotr, rroti.
+ * lib/jit_fallback.c: Implement fallbacks for lrotr, lroti,
+ rrotr and rroti. Also add an extra macro to avoid segfaults on s390,
+ which cannot use register zero for some addressing instructions.
+
+2023-03-02 Paulo Andrade <pcpa@gnu.org>
+
+ * check/popcnt.tst, check/popcnt.ok: New test files for the new
+ popcntr instruction.
+ * check/Makefile.am, check/lightning.c, include/lightning.h.in,
+ lib/jit_names.c, lib/lightning.c, doc/body.texi: Update for popcntr.
+ * lib/jit_aarch64-fpu.c, lib/jit_aarch64.c: Implement optimized
+ popcntr using the fpu.
+ * lib/jit_alpha-cpu.c, lib/jit_alpha.c: Implement optimized
+ popcntr using the ctpop instruction.
+ * lib/jit_arm-vfp.c, lib/jit_arm-cpu.c, lib/jit_arm.c: Implement
+ untested optimized popcntr using vfp >= 4, otherwise use a
+ software fallback.
+ * lib/jit_ia64-cpu.c, lib/jit_ia64.c: Implement optimized
+ popcntr using the popcnt instruction.
+ * lib/jit_ppc-cpu.c, lib/jit_ppc.c: Implement optimized
+ popcntr using the popcntb instruction, plus mullr and rshi_u.
+ * lib/jit_x86-cpu.c, lib/jit_x86.c: Implement optimized
+ popcntr instruction using the popcnt instruction if available,
+ otherwise use an optimized fallback.
+ * lib/jit_fallback.c: Implement simple fallback popcnt.
+ * lib/jit_hppa.c, lib/jit_loongarch.c, lib/jit_mips.c,
+ lib/jit_riscv.c, lib/jit_s390.c, lib/jit_sparc.c: Use fallback
+ popcnt.
+
+2023-02-26 Paulo Andrade <pcpa@gnu.org>
+
+ * check/bit.tst: Correct 32 bit sample ctz implementation.
+ * include/lightning/jit_mips.h: Add jit_cpu flags for instructions
+ that cannot be used in delay slot.
+ * lib/jit_fallback.c: Mips fallbacks now might need a flush of
+ instructions to get correct label addresses, due to a pending
+ instruction that is a candidate for the delay slot.
+ * lib/jit_mips-cpu.c: Flush any pending instruction if it cannot
+ be used in the delay slot. Add calls to fallback clo, clz, cto and
+ ctz for mips 1.
+ * lib/jit_mips.c: Add code to set defaults or detect whether
+ certain instructions can be used in delay slots.
+
+2023-02-23 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning/jit_private.h: Add new 'inst' field to
+ jit_compiler_t, if __mips__ is defined. This field is a simple
+ helper for a pending instruction to be emitted, and that can
+ be emitted out of order.
+ * lib/jit_fallback.c: Update for changes in internal mips patching
+ and jumping macros and function calls.
+ * lib/jit_mips-cpu.c: Core of changes to attempt to fill delay
+ slots with instructions that can be emitted out of order.
+ * lib/jit_mips-fpu.c: Update to use delay slot in branches.
+ * lib/jit_mips.c: Update for new delay slot use logic.
+
+2023-02-20 Paulo Andrade <pcpa@gnu.org>
+
+ * check/float.tst: Add conditionals for mips release for expected
+ NaN truncated to an integer.
+ * check/lightning.c: Add extra preprocessor for mips release.
+ * include/lightning/jit_mips.h: Make the NEW_ABI preprocessor
+ defined to zero if not using the n32 or n64 abis. This makes it
+ easier to create runtime checks with an always true or false
+ condition.
+ * lib/jit_mips-cpu.c, lib/jit_mips-fpu.c: Implement mips release
+ 6 support.
+ * lib/jit_mips.c: Add more reliable mips release detection code.
+
+2023-02-09 Paulo Andrade <pcpa@gnu.org>
+
+ * check/Makefile.am: Update for new bit.tst test, to check the
+ new clor, clzr, ctor and ctzr instructions.
+ * check/all.tst: Update to verify encoding of new instructions.
+ * check/lightning.c: Update to have the lightning "assembler"
+ understand the new instructions.
+ * include/lightning.h.in: Define new codes for new instructions.
+ * lib/jit_aarch64.c, lib/jit_alpha.c, lib/jit_arm.c, lib/jit_hppa.c,
+ lib/jit_ia64.c, lib/jit_loongarch.c, lib/jit_mips.c, lib/jit_ppc.c,
+ lib/jit_riscv.c, lib/jit_s390.c, lib/jit_sparc.c, lib/jit_x86.c:
+ Implement fallback version of new instructions.
+ * lib/jit_fallback.c: Actual implementation of the fallbacks of
+ the new instructions.
+ * lib/jit_names.c: Update to print debug information of new
+ instructions.
+
+2023-01-26 Paulo Andrade <pcpa@gnu.org>
+
+ * check/riprel.c, check/riprel.ok: New check files.
+ * check/Makefile.am: Support for new riprel test.
+ * lib/jit_x86-cpu.c, lib/jit_x86-sse.c, lib/jit_x86.c: Implement
+ %rip relative addressing when reliable. Currently disabled for
+ x32 and _WIN32; could be added for positive relative addresses
+ only where it should work.
+ * lib/lightning.c: Correct problem added in previous patch due
+ to not testing on a 32 bit environment.
+
+2023-01-23 Paulo Andrade <pcpa@gnu.org>
+
+ * lib/jit_mips-cpu.c: Use pseudo instructions
+ "b" (BEQ(0,0,disp)) and "bal" (BGEZAL(0,disp)) for mips2, when an
+ unconditional branch or function call is known to be in range of a
+ relative jump. This should significantly reduce the size of the
+ generated jit code.
+
+2023-01-20 Paulo Andrade <pcpa@gnu.org>
+
+ * lib/jit_mips-cpu.c, lib/jit_mips.c, lib/jit_rewind.c: Adapt
+ code to implement a variable framesize and optimize frame pointer
+ for simple leaf functions.
+
+2023-01-19 Paulo Andrade <pcpa@gnu.org>
+
+ * lib/jit_riscv.c, lib/jit_riscv-cpu.c: Adapt code to use a
+ variable framesize. Previously it was aligning the stack at
+ 8 bytes, not 16. Now functions are called with a 16 byte aligned
+ stack.
+
+2023-01-18 Paulo Andrade <pcpa@gnu.org>
+
+ * include/lightning/jit_private.h: Include new framesize field
+ of jit_compiler_t; add new alist field for jit_function_t; add
+ new cvt_offset and need_stack fields specific to x86.
+ * lib/jit_x86.c, lib/jit_x86-cpu.c: Rewrite code to create stack
+ frames, so that less stack space can be used if no, or very few
+ callee save registers are modified in a function.
+ * lib/jit_x86-sse.c, lib/jit_x86-x87.c: Make CVT_OFFSET variable, and
+ dynamically allocated; this is required to avoid needing to
+ modify %rsp twice in function prologs, even if no stack space
+ is used.
+
+2022-11-09 Paulo Andrade <pcpa@gnu.org>
+
+ * configure.ac: Add new --enable-devel-strong-type-checking
+ option.
+ * include/lightning.h.in: Rework to not need to know if
+ PACKED_STACK is defined, and add a new argument to _jit_arg,
+ _jit_putarg{r,i}, _jit_pusharg{r,i} and _jit_ret{r,i} to have
+ the same code path if PACKED_STACK is defined or not, and also
+ to implement STRONG_TYPE_CHECK enabled with the new
+ --enable-devel-strong-type-checking.
+ * include/lightning/jit_private.h: Add new macros to add assertions
+ for STRONG_TYPE_CHECK and avoid pasting tokens in jit_inc_synth*
+ when the token is not a static known value.
+ * lib/jit_aarch64.c: The first implementation of the new code,
+ working correctly on Apple M1 and with and without STRONG_TYPE_CHECK
+ on Linux.
+
+2022-11-08 Paulo Andrade <pcpa@gnu.org>
+
+ Add support for packed stack arguments as used by Apple M1
+ aarch64 cpus. This requires a major redesign in how Lightning
+ works, because contrary to all other supported ports, in this
+ case arguments must be truncated and sign/zero extended if
+ passed in registers, but when receiving the argument, there
+ is no need to truncate and sign/zero extend.
+ Return values are also treated this way. The callee must
+ truncate and sign/zero extend, not the caller.
+ * check/Makefile.am: Add LIGHTNING_CFLAGS to AM_CFLAGS.
+ * check/all.tst: Implement paired arg/getarg/pusharg/putarg/ret
+ codes to validate they do not generate assertions.
+ * check/allocar.tst, check/call.tst, check/fib.tst, check/put.tst,
+ check/stack.tst: Update to pass in all build types.
+ * check/lightning.c: Add new codes to handle packed stack.
+ * configure.ac: Add a preprocessor define to know if a packed stack
+ is required. This is not really used, as it was moved to
+ jit_aarch64.h.
+ * doc/Makefile.am: Add LIGHTNING_CFLAGS to AM_CFLAGS.
+ * doc/rpn.c: Update to pass in all build types.
+ * include/lightning.h.in: Add new codes and reorder enum.
+ * include/lightning/jit_aarch64.h: Detect condition of needing
+ a packed stack.
+ * lib/jit_aarch64-sz.c: Regenerate.
+ * lib/jit_aarch64.c: Major updates for packed stack.
+ * lib/jit_names.c: Updates for debug output.
+ * lib/lightning.c: Update for new codes.
+
+2022-10-31 Marc Nieper-Wißkirchen <marc@nieper-wisskirchen.de>
+
+ Add new skip instruction.
+ * .gitignore: Update from Gnulib.
+ * check/Makefile.am: Add tests.
+ * check/lightning.c: Handle skip instructions.
+ * check/protect.c: Rewrite with skip.
+ * check/skip.ok: New test.
+ * check/skip.tst: New test.
+ * doc/body.texi: Document the skip instruction.
+ * include/lightning.h.in: Add the skip instruction.
+ * lib/jit_aarch64-sz.c: Update for skip instruction.
+ * lib/jit_aarch64.c: Implement skip instruction.
+ * lib/jit_alpha-sz.c: Update for skip instruction.
+ * lib/jit_alpha.c: Implement skip instruction.
+ * lib/jit_arm-sz.c: Update for skip instruction.
+ * lib/jit_arm.c: Implement skip instruction.
+ * lib/jit_hppa-sz.c: Update for skip instruction.
+ * lib/jit_hppa.c: Implement skip instruction.
+ * lib/jit_ia64-sz.c: Update for skip instruction.
+ * lib/jit_ia64.c: Implement skip instruction.
+ * lib/jit_loongarch-sz.c: Update for skip instruction.
+ * lib/jit_loongarch.c: Implement skip instruction.
+ * lib/jit_mips-sz.c: Update for skip instruction.
+ * lib/jit_mips.c: Implement skip instruction.
+ * lib/jit_names.c: Update for skip instruction.
+ * lib/jit_ppc-sz.c: Update for skip instruction.
+ * lib/jit_ppc.c: Implement skip instruction.
+ * lib/jit_riscv-sz.c: Update for skip instruction.
+ * lib/jit_riscv.c: Implement skip instruction.
+ * lib/jit_s390-sz.c: Update for skip instruction.
+ * lib/jit_s390.c: Implement skip instruction.
+ * lib/jit_size.c: Treat align and skip in a special way.
+ * lib/jit_sparc-sz.c: Update for skip instruction.
+ * lib/jit_sparc.c: Implement skip instruction.
+ * lib/jit_x86-sz.c: Update for skip instruction.
+ * lib/jit_x86.c: Implement skip instruction.
+ * lib/lightning.c: Classify skip instruction.
+
+2022-10-30 Marc Nieper-Wißkirchen <marc@nieper-wisskirchen.de>
+
+ Add user-visible functions jit_protect and jit_unprotect.
+ * check/Makefile.am: Add test for jit_protect and jit_unprotect.
+ * check/protect.c: New test.
+ * doc/body.texi: Add documentation for jit_protect and
+ jit_unprotect.
+ * include/lightning.h.in: Add prototypes for jit_protect and
+ jit_unprotect.
+ * include/lightning/jit_private.h: Add a field to store the size
+ of the protected memory.
+ * lib/lightning.c: Remember the size of the protected memory and
+ implement the two new functions.
+
2022-10-12 Paulo Andrade <pcpa@gnu.org>
* include/lightning/jit_loongarch.h, lib/jit_loongarch-cpu.c,