FruityNutPi
Posts: 28
Joined: Sun Jun 17, 2018 1:26 pm

SWD timing

Tue May 30, 2023 3:30 pm

I am curious how much time needs to elapse between successive read or write operations on an SWD bus?

I am trying to figure out why two programs that appear to be sending exactly the same thing (at least according to the LA trace and decoder) yield different results.

Specifically, I am powering up the debug domain successfully and trying to read the IDR of the first AP with:

Code:

  uint8_t ack = 0;
  uint32_t apselect = APaddr&0x000000FF;

  apselect = (apselect<<24) | 0x000000F0;   // Select APaddr and bank 0xF
  ack = writePacket(setHostByte(SWD_DP, SWD_WR, SWD_DPR_SELECT), apselect);
  delay(5);
  if (ack==1) ack = readPacket(setHostByte(SWD_AP, SWD_RD, SWD_APR_REG_IDR), apidr);
  delay(5);
  if (ack==1) ack = readPacket(setHostByte(SWD_DP, SWD_RD, SWD_DPR_RDBUFF), apidr);
  delay(5);
I am comparing my Arduino effort with a program for the Pi called Picoreg for which the LA shows the result:

Code:

WSELECT OK  0x000000F0
R APc   OK  0x00000000
RDBUFF  OK  0x04770031
For my program, the result is as follows:

Code:

WSELECT OK  0x000000F0
R APc   OK  0x00000000
RDBUFF  OK  0x00000000
The trace shows exactly the same pattern with the exception of the data read in the third step.
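
For anyone checking the three packets above by hand, here is a minimal Python sketch that builds the 8-bit request header and the SELECT value. The helper names `swd_request` and `select_value` are made up for illustration; they are not part of picoreg or my Arduino code. It assumes the usual SWD header layout (start, APnDP, RnW, A[2], A[3], parity, stop, park) with bit 0 sent first.

```python
def swd_request(apndp, rnw, addr):
    """Build the 8-bit SWD request header; bit 0 is the first bit on the wire.

    apndp: 1 = AP access, 0 = DP access
    rnw:   1 = read, 0 = write
    addr:  register address (0x0, 0x4, 0x8 or 0xC); only bits 2 and 3 are sent
    """
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    # Parity bit is 1 when an odd number of the four header bits are set
    parity = (apndp + rnw + a2 + a3) & 1
    return (1 << 0) | (apndp << 1) | (rnw << 2) | (a2 << 3) | (a3 << 4) \
         | (parity << 5) | (0 << 6) | (1 << 7)   # stop = 0, park = 1


def select_value(apsel, apbank, dpbank=0):
    """DP SELECT value: APSEL in bits 31:24, APBANKSEL in 7:4, DPBANKSEL in 3:0."""
    return ((apsel & 0xFF) << 24) | ((apbank & 0xF) << 4) | (dpbank & 0xF)


# The sequence from this post: select bank 0xF of AP 0, read AP register 0xC
# (the IDR sits at 0xFC), then fetch the posted result from RDBUFF (DP 0xC).
steps = [
    ("WSELECT", swd_request(0, 0, 0x8), select_value(0, 0xF)),
    ("R APc",   swd_request(1, 1, 0xC), None),
    ("RDBUFF",  swd_request(0, 1, 0xC), None),
]
for name, req, data in steps:
    extra = "" if data is None else "  data 0x{:08X}".format(data)
    print("{:8s} request 0x{:02X}{}".format(name, req, extra))
```

AP reads are posted, so the data phase of the second packet is not expected to carry the IDR; the value should only show up in the RDBUFF read, which is what the picoreg trace shows.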

There is a difference in the timing. The Pi is faster and has gaps of approximately 5.8ms between each packet burst. I tried adding gaps of similar length but this made no difference to the result. I am not sure what I am missing, but if it's not timing or content, then I don't know what else to look for. I have attached a couple of screenshots comparing a burst from picoreg to the same trace generated by my program.
Attachments
trace-mine.png
trace-picoreg.png

jayben
Posts: 558
Joined: Mon Aug 19, 2019 9:56 pm

Re: SWD timing

Tue May 30, 2023 7:08 pm

Your version does look rather slow; the problem isn't so much the gap between transactions (which can be really long) but the timing within a transaction. There does seem to be a minimum speed below which SWD doesn't work, and also a few cycles into your trace there is a gap of around 300 usec in the clock and data. In the 'how it works' section at https://iosoft.blog/picoreg/ I do warn that this might be a problem.

I'd start by speeding up the Arduino clock rate, and maybe disable interrupts during each transaction so there are no gaps in it.
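
As a rough illustration of the interrupt suggestion, here is a minimal MicroPython-style sketch that masks interrupts around a single transaction using `machine.disable_irq()` and `machine.enable_irq()`. The wrapper name `uninterrupted` and the stand-in `fake_packet` are hypothetical, and the fallback stubs are only there so the sketch also runs on desktop Python. On Arduino the equivalent idea would be a `noInterrupts()` / `interrupts()` pair around the packet routine.

```python
try:
    import machine                      # available on MicroPython only
    disable_irq = machine.disable_irq   # returns the previous IRQ state
    enable_irq = machine.enable_irq     # restores the state passed to it
except (ImportError, AttributeError):
    # Fallback stubs so the sketch also runs on desktop Python for testing
    def disable_irq():
        return None

    def enable_irq(state):
        pass


def uninterrupted(transaction, *args):
    """Run one SWD transaction with interrupts masked, then restore them."""
    state = disable_irq()
    try:
        return transaction(*args)
    finally:
        enable_irq(state)               # always restore, even on an exception


# Hypothetical stand-in for a real bit-banged packet routine
def fake_packet(request, data):
    return (request, data)


result = uninterrupted(fake_packet, 0xB1, 0x000000F0)
print(result)
```

Keeping the masked region to one packet at a time means interrupts are only held off for the few dozen clocks of a transaction, not for the whole debug session.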

hippy
Posts: 14352
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: SWD timing

Tue May 30, 2023 9:28 pm

I would note that your SWDCLK idles high where 'picoreg' idles low. Don't know if that's the problem but would be worth checking anyway.

There are also some glitches which I'd guess are contention from both sides trying to drive the line at the same time.

When I was using bit-banged MicroPython and DBGFORCE to allow a Pico to debug its own Core 1, which I presume is driving actual debug hardware in the same way it would be driven by a real SWD connection, I found the clock rate could go really low but haven't tested with real SWD.

From when I was surfing I recall ARM documents suggesting 1kHz was the minimum, but it does seem chip dependent if others are to be believed.

FruityNutPi
Posts: 28
Joined: Sun Jun 17, 2018 1:26 pm

Re: SWD timing

Wed May 31, 2023 11:09 am

jayben wrote:
Tue May 30, 2023 7:08 pm
Your version does look rather slow; the problem isn't so much the gap between transactions (which can be really long) but the timing within a transaction. There does seem to be a minimum speed below which SWD doesn't work, and also a few cycles into your trace there is a gap of around 300 usec in the clock and data. In the 'how it works' section at https://iosoft.blog/picoreg/ I do warn that this might be a problem.

I'd start by speeding up the Arduino clock rate, and maybe disable interrupts during each transaction so there are no gaps in it.
Thanks. I tested speeding up the clock rate by up to x10, which is about as far as I can push it. The results shown here are sped up by a factor of 4. However, I am not yet sure how to disable interrupts on the Pico. This is something for me to research, but I have noticed that there is some unevenness in the clock cycles. I generally try to do all the calculations prior to transmission rather than on the fly, but am still seeing these random delays. I am not sure how fast a Pico can go. Its MCU apparently runs at 80MHz, but the Pi 3 running Python obviously runs much faster.
hippy wrote:
Tue May 30, 2023 9:28 pm
I would note that your SWDCLK idles high where 'picoreg' idles low. Don't know if that's the problem but would be worth checking anyway.
That's a good spot and I did some work on correcting that today. Thanks.

I have also checked that the SWDIO line is not being driven during a turnaround. I simply set it to input_pullup at the start of the cycle and to output high after the cycle if the program is due to drive the line. However, I do still see a glitch at the end of a read cycle.
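
To make the turnaround points explicit, here is a small Python sketch listing which side drives SWDIO in each phase of a transaction. It assumes an OK acknowledge and the default turnaround of one clock; the function name `transaction_phases` is just for illustration.

```python
def transaction_phases(is_read, turnaround=1):
    """Return (phase, clock cycles, who drives SWDIO) for one SWD transaction.

    Assumes the target answers OK and the default one-cycle turnaround.
    """
    phases = [("request", 8, "host"),
              ("turnaround", turnaround, "nobody"),
              ("ack", 3, "target")]
    if is_read:
        # Target keeps the line for data, then hands it back to the host
        phases += [("data+parity", 33, "target"),
                   ("turnaround", turnaround, "nobody")]
    else:
        # Line is handed back to the host before it sends the data
        phases += [("turnaround", turnaround, "nobody"),
                   ("data+parity", 33, "host")]
    return phases


for kind, is_read in (("read", True), ("write", False)):
    phases = transaction_phases(is_read)
    total = sum(cycles for _, cycles, _ in phases)
    print(kind, total, "clocks:", phases)
```

On a read the last phase is a turnaround, so if the host switches SWDIO back to output before that clock has finished, a glitch at the end of the read cycle may be the two sides briefly overlapping.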
hippy wrote:
Tue May 30, 2023 9:28 pm
When I was using bit-banged MicroPython and DBGFORCE to allow a Pico to debug its own Core 1, which I presume is driving actual debug hardware in the same way it would be driven by a real SWD connection, I found the clock rate could go really low but haven't tested with real SWD.
That's an interesting experiment that I haven't tried yet. I did actually also wonder whether picoreg_gpio.py would run in MicroPython on the Pico but hadn't got around to trying that yet. At one point I was running my code bit-banging a couple of GPIO pins connected to the SWD port of the Pico it was running on, and the Pico reset as soon as the debug domain was powered up. However, I didn't switch cores and was targeting the default core 0, not core 1, so the result was hardly surprising. Presently, I am using two Picos with one running the code and the other as the DUT.

I think my trace now looks a lot more like the picoreg one, but, disappointingly, the result has not changed. I am still getting a nil response for the first AP IDR. Obviously my code is still doing something incorrectly. Updated trace attached.
Attachments
trace-mine-wselect-02.png
trace-mine-rapc-02.png
trace-mine-rdbuff-02.png

fanoush
Posts: 1048
Joined: Mon Feb 27, 2012 2:37 pm

Re: SWD timing

Wed May 31, 2023 2:05 pm

It may be the timing of SWCLK versus SWDIO. On the picoreg image the clock pulses come a bit later, after the data is set; in yours, maybe you pulse CLK high too early, when the data is not there yet? I'd put a tiny delay after setting DIO before pulsing CLK.

Timing is described here https://developer.arm.com/documentation ... quirements
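
A minimal Python sketch of that ordering, with the pin setters passed in as callbacks so it is not tied to any particular board. The names `put_bit`, `set_swdio`, `set_swclk` and the one microsecond settle time are assumptions for illustration, not measured values.

```python
import time

def put_bit(bit, set_swdio, set_swclk, settle_us=1):
    """Drive one bit: data first, a short settle time, then a full clock pulse."""
    set_swdio(bit)
    time.sleep(settle_us / 1000000)     # on MicroPython, time.sleep_us(settle_us)
    set_swclk(1)                        # target captures SWDIO on this rising edge
    time.sleep(settle_us / 1000000)
    set_swclk(0)


# Record the order of pin changes instead of touching real hardware
events = []
put_bit(1,
        lambda b: events.append(("SWDIO", b)),
        lambda b: events.append(("SWCLK", b)))
print(events)   # [('SWDIO', 1), ('SWCLK', 1), ('SWCLK', 0)]
```

The point is only the order: SWDIO changes while SWCLK is low, and the rising edge comes after the data has had time to settle.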

FruityNutPi
Posts: 28
Joined: Sun Jun 17, 2018 1:26 pm

Re: SWD timing

Wed May 31, 2023 6:21 pm

Thank you for that link. It is exactly what I was looking for earlier on! I was sure I had read that and was looking for it to remind myself of the details.

Can I ask for a clarification of "on the rising edge of" ?

Does this mean that the data is read while the signal is still low and just prior to it rising high, or immediately after the rising edge is detected but before the high part of the duty cycle elapses? I rather suspect the latter and will adjust for that, as it's not quite what I have at the moment, but I guess what I am asking is: should the bit on SWDIO be set before or just after the rising edge? Sorry if it's a daft question.

On reflection, I think maybe your comment answers that. Data set first then pulse clock with enough delay to allow the data to settle on the line. At least, I think that's what I am reading here?

I just had one of those "oh what have I done" half hours because although the code was working as before in the terminal, the decoder in Pulseview was having trouble decoding it and I feared I had messed something up. Well, it turned out to be a change I had made to the function that generates the reset pulses. I had a for loop in there to generate 52 pulses (50 being the minimum according to the standard), but later on added a new function which accepts a parameter specifying the number of clock cycles to generate, among other things. A new call was added to this function, but I forgot to comment out the older code, which meant that the program was actually sending 2 x 52 clock cycles. I then commented out the old code, which reduced the number of cycles to 52 again, but for some reason this confused the decoder. I have now set the reset length to 64 cycles and the decoder seems happy again.
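
For what it's worth, the reset sequence I have ended up with can be sketched in a few lines of Python. The function name `line_reset_bits` and the two trailing idle cycles are my assumptions for illustration; the 50-cycle minimum is the figure from the standard mentioned above.

```python
def line_reset_bits(high_cycles=64, idle_cycles=2):
    """Return the SWDIO bit sequence for a line reset, one entry per clock.

    high_cycles: clocks with SWDIO held high (the standard asks for at least 50)
    idle_cycles: clocks with SWDIO low afterwards, before the first request
    """
    if high_cycles < 50:
        raise ValueError("line reset needs at least 50 clocks with SWDIO high")
    return [1] * high_cycles + [0] * idle_cycles


bits = line_reset_bits()
print(len(bits), "clocks,", sum(bits), "of them high")
```

Passing the cycle count as a parameter is what stops the old fixed loop and the new call from both running and doubling the reset length.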

hippy
Posts: 14352
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: SWD timing

Wed May 31, 2023 8:12 pm

FruityNutPi wrote:
Wed May 31, 2023 6:21 pm
I had a for loop in there to generate 52 pulses (50 being the minimum according to the standard)
That triggered a recollection that there is a requirement for two high bits to have been clocked in before the hardware will recognise the start bit of an SWD command. I simply clocked out two high bits prior to sending the start bit.

I can't recall if leaving them out caused a problem but it seemed I could clock out any number of high bits pre start bit without problem. It will likely depend on what one has previously clocked out.

In your 'trace-mine-wselect-02.png' there's no high for the last two bits as there is in the other images. Maybe ... ?

Getting the clocking right is what I struggled with as it's a case of 'get it wrong and it doesn't work' and it's not easy to tell why it hasn't.

I screwed up my clocking in the code I was working on but hadn't realised, ended up putting it to one side for a year before coming back to it and finally getting it right.

Lurk101's port of the Circle SWD Loader - viewtopic.php?p=2085604#p2085604 - and 'picoreg' were my 'go to resources' which eventually sorted me out

This is my code, MicroPython, but it may help -

Code:

# rp2040-datasheet.pdf
#
# 2.3.4.1. Software control of SWD pins
#
# The SWD pins for Core 0 and Core 1 can be bit-banged via registers in
# SYSCFG (see DBGFORCE). This means that Core 1 could run a USB application
# that allows debug of Core 0, or similar.
#
# Or we could run MicroPython and bit-bang SWD for core 1 entirely from
# within MicroPython.

DEBUG    = True         # Show debugging interactions

ON_CHIP  = True         # Use the on-chip DBGFORCE register

SWDIO_PIN = 16          # Pins for off-chip debugging
SWCLK_PIN = 15
RESET_PIN = None        # Assumed default: no reset pin (name is used below but was never defined)

# Start a thread on Core 1 ...

import _thread
def Core1():
  while True:
    pass
_thread.start_new_thread(Core1, [])

# .------------------------------------------------------------------------.
# |     Utility code                                                       |
# `------------------------------------------------------------------------'

from machine import mem32

failed = False

def Failed(b):
  global failed
  if b:
    failed = True
  return failed

def Debug(s=""):
  if DEBUG > 0:
    if s.startswith("-"):
      print("." + "-" * 76 + ".")
      print("|    {:<72}|".format(s[1:]))
      print("`" + "-" * 76 + "'")
      print("")
    else:
      print(s)

def Hex(n, w=8):
  if n < 0:
    n = (~n) + 1
  s = hex(n)[2:].upper()
  if s.endswith("L"):
    s = s[:-1]
  if len(s) < w:
    s = ("0" * (w-len(s))) + s
  return s

def Bin(n, w=8):
  s = bin(n)[2:]
  if len(s) < w:
    s = ("0" * (w-len(s))) + s
  return s

# .------------------------------------------------------------------------.
# |     On-chip debugging hardware registers                               |
# `------------------------------------------------------------------------'

SYSCFG   = 0x40004000
CONFIG   = SYSCFG + 0x08
DBGFORCE = SYSCFG + 0x14

PROC1_ATTACH = 7 # Attach Core 1 debug port to software controls
PROC1_SWCLK  = 6 # Drive Core 1 SWCLK
PROC1_SWDI   = 5 # Drive Core 1 SWDIO input
PROC1_SWDO   = 4 # The value of Core 1 SWDIO output

PROC0_ATTACH = 3 # Attach Core 0 debug port to software controls
PROC0_SWCLK  = 2 # Drive Core 0 SWCLK
PROC0_SWDI   = 1 # Drive Core 0 SWDIO input
PROC0_SWDO   = 0 # The value of Core 0 SWDIO output

# .------------------------------------------------------------------------.
# |     GPIO registers                                                     |
# `------------------------------------------------------------------------'

SIO_BASE            = 0xD0000000

GPIO_IN             = SIO_BASE + 0x004

GPIO_OUT            = SIO_BASE + 0x010
GPIO_OUT_SET        = SIO_BASE + 0x014
GPIO_OUT_CLR        = SIO_BASE + 0x018
GPIO_OUT_XOR        = SIO_BASE + 0x01C

# .------------------------------------------------------------------------.
# |     DP and AP register definitions                                     |
# `------------------------------------------------------------------------'

AP, DP = 1, 0
RD, WR = 1, 0

RD_DP_IDCODE , WR_DP_ABORT  = [RD, DP, 0x0, "IDCODE"], [WR, DP, 0x0, "ABORT" ]
RD_DP_STAT   , WR_DP_CTRL   = [RD, DP, 0x4, "STAT"  ], [WR, DP, 0x4, "CTRL"  ]
RD_DP_RESEND , WR_DP_SELECT = [RD, DP, 0x8, "RESEND"], [WR, DP, 0x8, "SELECT"]
RD_DP_RDBUF  , WR_DP_TGTSEL = [RD, DP, 0xC, "RDBUF" ], [WR, DP, 0xC, "TGTSEL"]

DP_ABORT_STKCMPCLR_SHL   = 1
DP_ABORT_STKERRCLR_SHL   = 2
DP_ABORT_WDERRCLR_SHL    = 3
DP_ABORT_ORUNERRCLR_SHL  = 4

DP_STAT_ORUNDETECT_SHL   = 0
DP_STAT_STICKYERR_SHL    = 5
DP_STAT_CDBGPWRUPREQ_SHL = 28
DP_STAT_CDBGPWRUPACK_SHL = 29
DP_STAT_CSYSPWRUPREQ_SHL = 30
DP_STAT_CSYSPWRUPACK_SHL = 31

DP_SELECT_DPBANKSEL_SHL = 0
DP_SELECT_APBANKSEL_SHL = 4
DP_SELECT_APSEL_SHL     = 24

APBANKSEL_AHB_AP  = 0x00
APBANKSEL_APB_AP  = 0x01
APBANKSEL_JTAG_AP = 0x02

DPBANKSEL_CTRL_STAT = 0x0 # What appears at DP_CTRL/DP_STAT register 0x4
DPBANKSEL_DLCR      = 0x1
DPBANKSEL_TARGETID  = 0x2
DPBANKSEL_DLPIDR    = 0x3

RD_AP_CSW , WR_AP_CSW = [RD, AP, 0x0, "CSW"], [WR, AP, 0x0, "CSW"]
RD_AP_TAR , WR_AP_TAR = [RD, AP, 0x4, "TAR"], [WR, AP, 0x4, "TAR"]
RD_AP_DRW , WR_AP_DRW = [RD, AP, 0xC, "DRW"], [WR, AP, 0xC, "DRW"]

AP_CSW_SIZE_32BITS     = 2
AP_CSW_ADDR_INC_SINGLE = 1
AP_CSW_DEVICE_EN       = 1
AP_CSW_PROT_DEFAULT    = 0x22
AP_CSW_DBG_SW_EN       = 1

AP_CSW_SIZE_SHL        = 0
AP_CSW_ADDR_INC_SHL    = 4
AP_CSW_DEVICE_EN_SHL   = 6
AP_CSW_PROT_SHL        = 24
AP_CSW_DBG_SW_EN_SHL   = 31

DHCSR                  = [0xE000EDF0, "DHCSR"]
DHCSR_C_DEBUG_EN       = 1
DHCSR_C_HALT           = 1
DHCSR_DBGKEY_KEY       = 0xA05F
DHCSR_C_DEBUG_EN_SHL   = 0
DHCSR_C_HALT_SHL       = 1
DHCSR_DBGKEY_SHL       = 16

DCRSR                  = [0xE000EDF4, "DCRSR"]
DCRSR_REGSEL_R15       = 15 # PC
DCRSR_REGW_N_R         = 1
DCRSR_REGSEL__SHL      = 0
DCRSR_REGW_N_R_SHL     = 16

DCRDR                  = [0xE000EDF8, "DCRDR"]

# .------------------------------------------------------------------------.
# |     Lowest level on-chip debugging interface                           |
# |------------------------------------------------------------------------|
# |     OnChip_Pins()         Report DBGFORCE value                        |
# |     OnChip_SetAttach(b)   Set attach bit           0/1                 |
# |     OnChip_SetSwdio(b)    Set data bit             0/1                 |
# |     OnChip_GetSwdio()     Read data bit            0/1                 |
# |     OnChip_SetSwclk(b)    Set clock bit            0/1                 |
# `------------------------------------------------------------------------'

def OnChip_Pins(desc=""):
  v = mem32[DBGFORCE]
  s = "  {}".format(desc)            \
    + "  {}".format(Bin(v))          \
    + " AT={}".format((v >> AT) & 1) \
    + " CK={}".format((v >> CK) & 1) \
    + " TX={}".format((v >> TX) & 1) \
    + " RX={}".format((v >> RX) & 1)
  Debug(s)

def OnChip_SetAttach(b):
  if b == 0 : mem32[DBGFORCE] &= ~(1 << AT)
  else      : mem32[DBGFORCE] |=  (1 << AT)

def OnChip_SetSwdio(b):
  if b == 0 : mem32[DBGFORCE] &= ~(1 << TX)
  else      : mem32[DBGFORCE] |=  (1 << TX)

def OnChip_GetSwdio():
  return (mem32[DBGFORCE] >> RX) & 1

def OnChip_SetSwclk(b):
  if b == 0 : mem32[DBGFORCE] &= ~(1 << CK)
  else      : mem32[DBGFORCE] |=  (1 << CK)

# .------------------------------------------------------------------------.
# |     Lowest level off-chip debugging interface                          |
# |------------------------------------------------------------------------|
# |     OffChip_Pins()        Report GPIO values                           |
# |     OffChip_SetAttach(b)  Initialise pins          0/1                 |
# |     OffChip_SetSwdio(b)   Set data pin             0/1                 |
# |     OffChip_GetSwdio()    Read data pin            0/1                 |
# |     OffChip_SetSwclk(b)   Set clock pin            0/1                 |
# |     OffChip_SetReset(b)   Set reset pin            0/1                 |
# `------------------------------------------------------------------------'

def OffChip_Pins(desc=""):
  pass

def OffChip_SetAttach(b):
  if b != 0:
    # Set SWDIO as output
    pass
    # Set SWCLK as output
    pass
    # Set RESET as output
    if RESET_PIN != None:
      pass
    # Set SWDIO low
    SetSwdio(0)
    # Set SWCLK low
    SetSwclk(0)
    # Issue a reset pulse
    SetReset(0)
    SetReset(1)
    SetReset(0)

def OffChip_SetSwdio(b):
  if b == 0 : mem32[GPIO_OUT_CLR] = 1 << SWDIO_PIN
  else      : mem32[GPIO_OUT_SET] = 1 << SWDIO_PIN

def OffChip_GetSwdio():
  return (mem32[GPIO_IN] >> SWDIO_PIN) & 1

def OffChip_SetSwclk(b):
  if b == 0 : mem32[GPIO_OUT_CLR] = 1 << SWCLK_PIN
  else      : mem32[GPIO_OUT_SET] = 1 << SWCLK_PIN

def OffChip_SetReset(b):
  if RESET_PIN != None:
    if b == 0 : mem32[GPIO_OUT_CLR] = 1 << RESET_PIN
    else      : mem32[GPIO_OUT_SET] = 1 << RESET_PIN

# .------------------------------------------------------------------------.
# |     Configure the low level interface                                  |
# `------------------------------------------------------------------------'

if ON_CHIP:

  Pins      = OnChip_Pins
  SetAttach = OnChip_SetAttach
  SetSwdio  = OnChip_SetSwdio
  GetSwdio  = OnChip_GetSwdio
  SetSwclk  = OnChip_SetSwclk

  AT        = PROC1_ATTACH
  TX        = PROC1_SWDI    # We send to its input
  RX        = PROC1_SWDO    # We read from its output
  CK        = PROC1_SWCLK

else:

  Pins      = OffChip_Pins
  SetAttach = OffChip_SetAttach
  SetSwdio  = OffChip_SetSwdio
  GetSwdio  = OffChip_GetSwdio
  SetSwclk  = OffChip_SetSwclk
  SetReset  = OffChip_SetReset

# .------------------------------------------------------------------------.
# |     Middle level interface                                             |
# |------------------------------------------------------------------------|
# |     Put(n, bits)    Clock a number of bits out to SWD                  |
# |     Get(bits)       Clock a number of bits in from SWD                 |
# `------------------------------------------------------------------------'

parityBits = 0

def Put(n, bits=1):
  # Send a number of bits, lsb first, and keep track of the last 32 bits
  # sent so we can determine the parity later as needed
  global parityBits
  for cnt in range(bits):
  # Debug("  Put {}".format(n & 1))
    SetSwdio(n & 1)
    SetSwclk(1)
    SetSwclk(0)                 # 12345678
    parityBits = ((parityBits & 0x7FFFFFFF) << 1 ) | (n & 1)
    n >>= 1

def Get(bits=1):
  # Read a number of bits, lsb first
  SetSwdio(0)
  n = 0
  for cnt in range(bits):
    SetSwclk(1)
    SetSwclk(0)
    b = GetSwdio()
  # Debug("  Get {}".format(b))
    n = (n >> 1 ) | ( b << (bits-1))
  return n

# .------------------------------------------------------------------------.
# |     High level interface                                               |
# |------------------------------------------------------------------------|
# |     Rst()           Line Reset                                         |
# |     Swd()           Select SWD mode                                    |
# |     Cmd(...)        Send a command to the SWD DAP                      |
# |     Ack()           Read an Ack response                               |
# |     Dat()           Send or receive 32-bit data                        |
# |     Par()           Send or receive a parity bit                       |
# `------------------------------------------------------------------------'

def Rst():
  global DEBUG
  Debug(" Rst")
  DEBUG = -DEBUG
  Put(0xFFFFFFFF, 32)
  Put(   0xFFFFF, 20)
  Put(      0x00,  8)
  DEBUG = -DEBUG

def Swd():
  # To enable SWD
  #   8 bits high
  # 128 bits of magical incantation
  #   4 bits low
  #   8 bits to activate the SW-DP interface (0x1A)
  global DEBUG
  Debug(" Swd")
  DEBUG = -DEBUG
  Put(      0xFF,  8)
  Put(0x6209F392, 32)
  Put(0x86852D95, 32)
  Put(0xE3DDAFE9, 32)
  Put(0x19BC0EA2, 32)
  Put(         0,  4)
  Put(      0x1A,  8)
  DEBUG = -DEBUG

def Cmd(RnW, APnDP=None, adr=None, nam=""):
  if APnDP == None:
    reg   = RnW
    RnW   = reg[0]
    APnDP = reg[1]
    adr   = reg[2]
    nam   = reg[3]
  Put(0, 2)             # Idle
  Debug(" Cmd {} {} {} (0x{:1X})".format(["WR", "RD"][RnW],
                                         ["DP", "AP"][APnDP],
                                         nam, adr))
  Put(1)                # Start
  Put(APnDP)            # 1=AP, 0=DP
  Put(RnW)              # 1=RD, 0=WR
  Put((adr >> 2) & 1)   # adr[2]
  Put((adr >> 3) & 1)   # adr[3]
  Par(4)                # Parity of previous 4 bits
  Put(0)                # Stop
  Put(1)                # Park
  if False:
    txd = (((parityBits >> 0) & 1) << 7) | \
          (((parityBits >> 1) & 1) << 6) | \
          (((parityBits >> 2) & 1) << 5) | \
          (((parityBits >> 3) & 1) << 4) | \
          (((parityBits >> 4) & 1) << 3) | \
          (((parityBits >> 5) & 1) << 2) | \
          (((parityBits >> 6) & 1) << 1) | \
          (((parityBits >> 7) & 1) << 0)
    Debug(" Cmd as per swdloader.cpp = {:02x}".format(txd))

def Ack():
  ack = Get(3)
  s = "[{},{},{}] - ".format((ack >> 0) & 1, (ack >> 1) & 1,(ack >> 2) & 1)
  if   ack == 0b001 : s += "Okay"
  elif ack == 0b010 : s += "Wait"
  elif ack == 0b100 : s += "Fault"
  else              : s += "Unknown"
  Debug(" Ack {} - Bit order {}".format(Bin(ack, 3), s))
  Failed(ack != 0b001)
  return ack

def Dat(dat=None):
  if dat == None:
    # Read 32-bit data, but not parity
    dat = Get(32)
    Debug(" Get 0x{:08X}".format(dat))
    return dat
  else:
    # Send 32-bit data plus parity
    Debug(" Put 0x{:08X}".format(dat))
    Put(dat, 32)
    Par(32)             # Parity of previous 32 bits

def Par(bits=None):
  if bits == None:
    # Read parity bit
    par = Get(1)
  # Debug(" Par {}".format(par))
    return par
  else:
    # Send parity of last number of bits
    cpy = parityBits
    par = 0
    for cnt in range(bits):
      par ^= cpy & 1
      cpy >>= 1
  # Debug(" Par {}".format(par))
    Put(par)

# .------------------------------------------------------------------------.
# |     Highest level DP and AP interface                                  |
# |------------------------------------------------------------------------|
# |     WriteDap(...)   Write to a DP or AP register                       |
# |     ReadDap(...)    Read from a DP or AP register                      |
# |     Status()        Report the current status                          |
# `------------------------------------------------------------------------'

def WriteDap(cmd, dat):
  Debug("WriteDap {} {} {} <= 0x{:08X}".format(["WR", "RD"][cmd[0]],
                                               ["DP", "AP"][cmd[1]],
                                               cmd[3], dat))
  # Extra 0 here have no effect - also Get
  # Extra 1 break things
# Put(0, 8)
  Cmd(cmd)
  Ack()
  if failed : return
  # Two extra 0 here makes things work - Also Get
  # Two extra 1 also seem to work
  Get(2)
  Dat(dat)
  # Extra 0 here seem to have no effect - Also Get
  # Extra 1 break things
# Put(0, 8)
  Debug()

def ReadDap(cmd):
  Debug("ReadDap {} {} {}".format(["WR", "RD"][cmd[0]],
                                  ["DP", "AP"][cmd[1]],
                                   cmd[3]))
  # Extra 0 here have no effect - Also Get
  # Extra 1 break things
# Put(0, 8)
  Cmd(cmd)
  Ack()
  if failed : return
  dat = Dat()
  par = Par()
  # Extra 0 here have no effect - Also Get
  # Extra 1 break things
# Put(0, 8)
  Debug()
  return dat

def Status():
  Debug("Status")
  Cmd(RD_DP_STAT)
  Ack()
  if failed : return
  dat = Dat()
  par = Par()
  if dat == 0:
    Debug("Status = Okay")
  else:
    sts = []
    def StsBit(shr, msk, bits,  txt):
      if (dat >> shr) & msk:
        txt = bits + ":" + txt
        if msk == 1 : sts.append(txt)
        else        : sts.append(txt + "=" + str((dat >> shr) & msk))
    StsBit(31,   1,    "31", "CSYSPWRUPACK")
    StsBit(30,   1,    "30", "CSYSPWRUPREQ")
    StsBit(29,   1,    "29", "CDBGPWRUPACK")
    StsBit(28,   1,    "28", "CDBGPWRUPREQ")
    StsBit(27,   1,    "27", "CDBGRSTACK")
    StsBit(26,   1,    "26", "CDBGRSTREQ")
    StsBit(24,   3, "25-24", "RAZ/SBZP")
    StsBit(12, 255, "21-12", "TRNCNT")
    StsBit( 8,  15,  "11-8", "MASKLANE")
    StsBit( 7,   1,     "7", "WDATAERR")
    StsBit( 6,   1,     "6", "READOK")
    StsBit( 5,   1,     "5", "STICKYERR")
    StsBit( 4,   1,     "4", "STICKYCMP")
    StsBit( 2,   3,   "3-2", "TRNMODE")
    StsBit( 1,   1,     "1", "STICKYORUN")
    StsBit( 0,   1,     "0", "ORUNDETECT")
    Debug("Status = {}".format(", ".join(sts)))
  Debug()
  return dat

# .------------------------------------------------------------------------.
# |     Highest level memory interface                                     |
# |------------------------------------------------------------------------|
# |     SetAddress(adr) Set address                                        |
# |     Poke(...)       Write to an address within the chip                |
# |     Peek(...)       Read from an address within the chip               |
# `------------------------------------------------------------------------'

def SetAddress(adr):
  Debug("SetAddress 0x{:08X}".format(adr))
  Debug()
  WriteDap(WR_AP_TAR, adr)

def Poke(adr, dat=None):
  # Poke(dat)           Write to next address
  # Poke(-1, dat)       Write to next address
  # Poke(adr, dat)      Write to specified address
  if dat == None:
    adr, dat = -1, adr
  if isinstance(adr, int):
    if adr < 0 : Debug("Poke <= 0x{:08X}".format(dat))
    else       : Debug("Poke 0x{:08X} <= 0x{:08X}".format(adr, dat))
  else:
    adr, nam = adr
    Debug("Poke {} (0x{:08X}) <= 0x{:08X}".format(nam, adr, dat))
  Debug()
  if adr >= 0:
    WriteDap(WR_AP_TAR, adr)
    if failed : return
  WriteDap(WR_AP_DRW, dat)

def Peek(adr=-1):
  # Peek()              Read from next address
  # Peek(-1)            Read from next address
  # Peek(adr)           Read from specified address
  if isinstance(adr, int):
    if adr < 0 : Debug("Peek")
    else       : Debug("Peek 0x{:08X}".format(adr))
  else:
    adr, nam = adr
    Debug("Peek {} (0x{:08X})".format(nam, adr))
  Debug()
  if adr >= 0:
    WriteDap(WR_AP_TAR, adr)
    if failed : return
  ReadDap(RD_AP_DRW)
  if failed : return
  dat = ReadDap(RD_DP_RDBUF)
  return dat

# **************************************************************************
# *                                                                        *
# *     Let's get debugging !                                              *
# *                                                                        *
# **************************************************************************

def Main():

  Debug("-Initialising")

  # We need to attach to the on-chip debugger or reset an off-chip target

  SetAttach(1)

  # Now enter the SWD debugging mode

  Debug("Entering SWD mode")
  Swd()
  Rst()
  Debug()

  # .----------------------------------------------------------------------.
  # |   Select the target core                                             |
  # `----------------------------------------------------------------------'

  Debug("-Select Target Core 1")

  # Each DAP will only respond to debug commands if correctly addressed
  # by a WR_DP_TGTSEL target select command -

  CORE_0 = 0x01002927
  CORE_1 = 0x11002927
  RESCUE = 0xF1002927

  Debug("WR DP TGTSEL")
  Cmd(WR_DP_TGTSEL)
  n = Get(5)
  Debug( " Ign {}".format(Bin(n,5)))
  Dat(CORE_1)
  Debug()

  # .----------------------------------------------------------------------.
  # |   Read IDCODE - Should be 0x0BC12477                                 |
  # `----------------------------------------------------------------------'

  Debug("-Read IDCODE - Should be 0x0BC12477")

  if False:
    # The long way - Useful for debugging issues
    for tries in range(1):
      Debug("RD DP IDCODE")
      Cmd(RD_DP_IDCODE)
      Ack()
      if failed : return
      dat = Dat()
      par = Par()
      Debug()
      if failed : return
      if dat == 0x0BC12477 : Debug("Okay - Identified chip as RP2040")
      else                 : Debug("Fail - Should be 0x0BC12477")
      Debug()
      if Failed(dat != 0xBC12477) : return

  if True:
    # The short way - Proves ReadDap works
    for tries in range(1):
      dat = ReadDap(RD_DP_IDCODE)
      if failed : return
      if dat == 0x0BC12477 : Debug("Okay - Identified chip as RP2040")
      else                 : Debug("Fail")
      Debug()
      if Failed(dat != 0x0BC12477) : return

  # .----------------------------------------------------------------------.
  # |   Read the DLPIDR - Should be 0x10000001                             |
  # `----------------------------------------------------------------------'

  Debug("-Read DLPIDR - Should be 0x10000001")

  WriteDap(WR_DP_SELECT, 0x3)
  if failed : return

  dat = ReadDap(RD_DP_STAT)
  if failed : return

  if dat == 0x10000001 : Debug("Okay - DLPIDR = 0x{:08X}".format(dat))
  else                 : Debug("Fail - DLPIDR = 0x{:08X}".format(dat))
  Debug()
  if Failed(dat != 0x10000001) : return

  # .----------------------------------------------------------------------.
  # |   Clearing any bus errors                                            |
  # `----------------------------------------------------------------------'

  Debug("-Clearing any bus errors")

  WriteDap(WR_DP_SELECT, 0)
  if failed : return

  WriteDap(WR_DP_ABORT, (1 << DP_ABORT_STKCMPCLR_SHL ) |
                        (1 << DP_ABORT_STKERRCLR_SHL ) |
                        (1 << DP_ABORT_WDERRCLR_SHL  ) |
                        (1 << DP_ABORT_ORUNERRCLR_SHL))
  if failed : return

  WriteDap(WR_DP_SELECT, 0)
  if failed : return

  WriteDap(WR_DP_CTRL, (1 << DP_STAT_ORUNDETECT_SHL  ) |
                       (1 << DP_STAT_STICKYERR_SHL   ) |
                       (1 << DP_STAT_CDBGPWRUPREQ_SHL) |
                       (1 << DP_STAT_CSYSPWRUPREQ_SHL))
  if failed : return

  dat = ReadDap(RD_DP_STAT)
  if failed : return

  def BitCheck(nam, dat, shr, want):
    n = (dat >> shr) & 1
    if n == want : s = "Okay"
    else         : s = "Fail"
    Debug("{} - {} = {}".format(s, nam, n))
    Failed(n != want)

  BitCheck("CDBGPWRUPACK", dat, DP_STAT_CDBGPWRUPACK_SHL, 1)
  BitCheck("CSYSPWRUPACK", dat, DP_STAT_CSYSPWRUPACK_SHL, 1)
  Debug()
  if failed : return

  # .----------------------------------------------------------------------.
  # |   Halt and configure the core                                        |
  # `----------------------------------------------------------------------'

  # See : http://markding.github.io/swd_programing_sram

  Debug("-Halt and configure the core")

  WriteDap(WR_AP_CSW, (AP_CSW_SIZE_32BITS     << AP_CSW_SIZE_SHL     ) |
                      (AP_CSW_ADDR_INC_SINGLE << AP_CSW_ADDR_INC_SHL ) |
                      (AP_CSW_DEVICE_EN       << AP_CSW_DEVICE_EN_SHL) |
                      (AP_CSW_PROT_DEFAULT    << AP_CSW_PROT_SHL     ) |
                      (AP_CSW_DBG_SW_EN       << AP_CSW_DBG_SW_EN_SHL))
  if failed : return

  Poke(DHCSR,         (DHCSR_C_DEBUG_EN << DHCSR_C_DEBUG_EN_SHL) |
                      (DHCSR_C_HALT     << DHCSR_C_HALT_SHL    ) |
                      (DHCSR_DBGKEY_KEY << DHCSR_DBGKEY_SHL    ))
  if failed : return

  # .----------------------------------------------------------------------.
  # |   Define memory area to access                                       |
  # `----------------------------------------------------------------------'

  @micropython.asm_thumb        # Returns address of bytearray
  def AddressOf(r0):
    nop()

  size = 4                      # 4 x 32-bit words

  ba = bytearray()
  for n in range(size * 4):
    ba.append(n + 1)

  base = AddressOf(ba)

  # .----------------------------------------------------------------------.
  # |   Read initial memory                                                |
  # `----------------------------------------------------------------------'

  Debug("-Read initial memory")

  SetAddress(base)              # Starting address
  if failed : return

  lst = []
  for offset in range(size):
    dat = Peek()                # Read next in sequence
    if failed : return
    lst.append(dat)

  print("Initial memory contents")
  print("")
  for offset in range(size):
    adr = base + (offset * 4)
    print("{:08X} : {:08X} {:08X}".format(adr, mem32[adr], lst[offset]))
  print("")

  # .----------------------------------------------------------------------.
  # |   Write memory                                                       |
  # `----------------------------------------------------------------------'

  Debug("-Update memory")

  SetAddress(base)              # Starting address
  if failed : return

  dat = 0x12345678
  for offset in range(size):
    Poke(dat)                   # Write to next in sequence
    if failed : return
    dat += 0x01010101

  # .----------------------------------------------------------------------.
  # |   Read updated memory                                                |
  # `----------------------------------------------------------------------'

  Debug("-Read updated memory")

  SetAddress(base)              # Starting address
  if failed : return

  lst = []
  for offset in range(size):
    dat = Peek()                # Read next in sequence
    if failed : return
    lst.append(dat)

  print("Updated memory contents")
  print("")
  for offset in range(size):
    adr = base + (offset * 4)
    print("{:08X} : {:08X} {:08X}".format(adr, mem32[adr], lst[offset]))
  print("")

  # .----------------------------------------------------------------------.
  # |   Final status                                                       |
  # `----------------------------------------------------------------------'

  if False:
    Debug("-Final status")
    Status()

Main()

Code: Select all

.----------------------------------------------------------------------------.
|    Initialising                                                            |
`----------------------------------------------------------------------------'

Entering SWD mode
 Swd
 Rst

.----------------------------------------------------------------------------.
|    Select Target Core 1                                                    |
`----------------------------------------------------------------------------'

WR DP TGTSEL
 Cmd WR DP TGTSEL (0xC)
 Ign 00001
 Put 0x11002927

.----------------------------------------------------------------------------.
|    Read IDCODE - Should be 0x0BC12477                                      |
`----------------------------------------------------------------------------'

ReadDap RD DP IDCODE
 Cmd RD DP IDCODE (0x0)
 Ack 001 - Bit order [1,0,0] - Okay
 Get 0x0BC12477

Okay - Identified chip as RP2040

.----------------------------------------------------------------------------.
|    Read DLPIDR - Should be 0x10000001                                      |
`----------------------------------------------------------------------------'

WriteDap WR DP SELECT <= 0x00000003
 Cmd WR DP SELECT (0x8)
 Ack 001 - Bit order [1,0,0] - Okay
 Put 0x00000003

ReadDap RD DP STAT
 Cmd RD DP STAT (0x4)
 Ack 001 - Bit order [1,0,0] - Okay
 Get 0x10000001

Okay - DLPIDR = 0x10000001

.----------------------------------------------------------------------------.
|    Clearing any bus errors                                                 |
`----------------------------------------------------------------------------'

WriteDap WR DP SELECT <= 0x00000000
 Cmd WR DP SELECT (0x8)
 Ack 001 - Bit order [1,0,0] - Okay
 Put 0x00000000

WriteDap WR DP ABORT <= 0x0000001E
 Cmd WR DP ABORT (0x0)
 Ack 001 - Bit order [1,0,0] - Okay
 Put 0x0000001E

WriteDap WR DP SELECT <= 0x00000000
 Cmd WR DP SELECT (0x8)
 Ack 001 - Bit order [1,0,0] - Okay
 Put 0x00000000

WriteDap WR DP CTRL <= 0x50000021
 Cmd WR DP CTRL (0x4)
 Ack 001 - Bit order [1,0,0] - Okay
 Put 0x50000021

ReadDap RD DP STAT
 Cmd RD DP STAT (0x4)
 Ack 001 - Bit order [1,0,0] - Okay
 Get 0xF0000001

Okay - CDBGPWRUPACK = 1
Okay - CSYSPWRUPACK = 1
My code used to work, but the non-stock MicroPython firmware I am currently using hangs soon after that. That's likely an issue with my firmware rather than the code.
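
For anyone following the log, the final CTRL/STAT read of 0xF0000001 can be decoded by hand. This is a minimal sketch in plain Python, assuming the ADIv5 bit positions for DP CTRL/STAT; the names `CTRL_STAT_BITS` and `decode_ctrl_stat` are made up for illustration and are not part of the program above.

```python
# Bit positions in DP CTRL/STAT, assumed from the ADIv5 register layout.
CTRL_STAT_BITS = {
    "ORUNDETECT":   0,
    "STICKYERR":    5,
    "CDBGPWRUPREQ": 28,
    "CDBGPWRUPACK": 29,
    "CSYSPWRUPREQ": 30,
    "CSYSPWRUPACK": 31,
}

def decode_ctrl_stat(value):
    """Return a dict mapping each known flag name to 0 or 1."""
    return {name: (value >> bit) & 1 for name, bit in CTRL_STAT_BITS.items()}

# Value taken from the "Get 0xF0000001" line in the log.
flags = decode_ctrl_stat(0xF0000001)
for name in sorted(flags, key=CTRL_STAT_BITS.get):
    print("{:<12} = {}".format(name, flags[name]))
```

Both ACK bits come out as 1, which matches the two "Okay" lines at the end of the log, while the 0x50000021 written beforehand only sets the two request bits.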

fanoush
Posts: 1048
Joined: Mon Feb 27, 2012 2:37 pm

Re: SWD timing

Thu Jun 01, 2023 12:25 pm

FruityNutPi wrote:
Wed May 31, 2023 6:21 pm
On reflection, I think maybe your comment answers that. Data set first then pulse clock with enough delay to allow the data to settle on the line. At least, I think that's what I am reading here?
Yes

Also, when checking your images I forgot whether the clock pulse is high or low when the data needs to be valid, but from my code that worked https://github.com/fanoush/EspruinoBoar ... KV2/swd.js it seems that it is low. When reading/writing individual bits I pulse the clock low. For reading 32 bits I use SPI mode 2 (polarity 1 = idle state of the clock signal is high) and that works too, and for writing 8- and 32-bit values I used the shiftOut method, which by default also has the clock idling high and pulses it low to send data. That worked well too. In fact, by some coincidence, I never had any issues with SWD clock timing, either when using a high-level language or with the native SPI and shiftOut implementations.
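
The "data first, then pulse the clock low" ordering can be sketched roughly as below. This is only an illustration in plain Python, not the code from the linked swd.js: `write_bits` and the two callbacks are hypothetical names, and the "pins" are stand-ins that record events so the snippet runs anywhere. It assumes bits go out LSB first and that the clock idles high.

```python
def write_bits(value, nbits, set_data, set_clock):
    """Shift out nbits of value, LSB first, with the clock idling high."""
    for i in range(nbits):
        set_data((value >> i) & 1)   # put the bit on the data line first
        set_clock(0)                 # then pulse the clock low
        set_clock(1)                 # and back high (the rising edge)

# Stand-in "pins" that just record what happened.
events = []
write_bits(0xA5, 8,
           lambda bit: events.append(("dat", bit)),
           lambda lvl: events.append(("clk", lvl)))

bits_sent = [v for kind, v in events if kind == "dat"]
print(bits_sent)
```

On real hardware the two callbacks would drive the SWDIO and SWCLK pins, possibly with a short delay after `set_data` so the line has time to settle.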

FruityNutPi
Posts: 28
Joined: Sun Jun 17, 2018 1:26 pm

Re: SWD timing

Thu Jun 01, 2023 5:41 pm

hippy wrote:
Wed May 31, 2023 8:12 pm
FruityNutPi wrote:
Wed May 31, 2023 6:21 pm
I had a for loop in there to generate 52 pulses (50 being the minimum according to the standard)
That triggered a recollection that there is a requirement for two high bits to have been clocked in before the hardware will recognise the start bit of an SWD command. I simply clocked out two high bits prior to sending the start bit.
Yes, there is a need for two idle bits, although the diagram seems to show the line driven low? The IDR command after a reset doesn't work if they are not there. Conversely, switching JTAG to SWD (where required) and vice versa does not require them and fails if they are present, so the 50 cycles before and after do not appear to be a reset sequence as I initially had imagined...
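
To make the sequence concrete, here is a minimal sketch of what goes out on SWDIO for a line reset followed by an IDCODE read. It assumes the usual SWD request layout (start, APnDP, RnW, A[2:3], parity, stop, park, sent LSB first) and uses the 52 high clocks and two low idle clocks mentioned above; `swd_request` and `line_reset_then` are illustrative names only, not functions from picoreg or from my Arduino code.

```python
def swd_request(apndp, rnw, addr):
    """Build the 8-bit SWD request header (sent LSB first)."""
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    parity = (apndp + rnw + a2 + a3) & 1   # even parity over the 4 fields
    return (1                # start bit, always 1
            | apndp  << 1    # 0 = DP, 1 = AP
            | rnw    << 2    # 0 = write, 1 = read
            | a2     << 3
            | a3     << 4
            | parity << 5
            | 0      << 6    # stop bit, always 0
            | 1      << 7)   # park bit, always 1

def line_reset_then(request, reset_clocks=52, idle_clocks=2):
    """Return the SWDIO level for each clock: reset, idle, then request."""
    bits = [1] * reset_clocks + [0] * idle_clocks
    bits += [(request >> i) & 1 for i in range(8)]
    return bits

req = swd_request(apndp=0, rnw=1, addr=0x0)   # read DP IDCODE
print("0x{:02X}".format(req))
print(len(line_reset_then(req)))
```

The returned list could be fed one entry per clock to a bit-banged output routine.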

hippy wrote:
Wed May 31, 2023 8:12 pm
In your 'trace-mine-wselect-02.png' there's no high for the last two bits as there is in the other images. Maybe ... ?
Yup, they seem to be there after a read operation but not after a write. The picoreg output does not have them either....
hippy wrote:
Wed May 31, 2023 8:12 pm
Getting the clocking right is what I struggled with as it's a case of 'get it wrong and it doesn't work' and it's not easy to tell why it hasn't.
No argument from me there. Thank you for the code snippets. I will have a look.
fanoush wrote:
Thu Jun 01, 2023 12:25 pm
Also, when checking your images I forgot whether the clock pulse is high or low when the data needs to be valid, but from my code that worked https://github.com/fanoush/EspruinoBoar ... KV2/swd.js it seems that it is low. When reading/writing individual bits I pulse the clock low. For reading 32 bits I use SPI mode 2 (polarity 1 = idle state of the clock signal is high) and that works too, and for writing 8- and 32-bit values I used the shiftOut method, which by default also has the clock idling high and pulses it low to send data. That worked well too. In fact, by some coincidence, I never had any issues with SWD clock timing, either when using a high-level language or with the native SPI and shiftOut implementations.
That's also interesting. I haven't tried using SPI mode for transmitting and reading the SWD data. SPI presumably handles the clock automatically. The Pico seems to have a number of SPI pin pairs that could be used, although aren't RX and TX on separate pins?
Attachments
swd_idcode_sequence.png (28.58 KiB) Viewed 1805 times

hippy
Posts: 14352
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: SWD timing

Thu Jun 01, 2023 7:08 pm

FruityNutPi wrote:
Thu Jun 01, 2023 5:41 pm
Yes, there is a need for two idle bits, although the diagram seems to show the line driven low?
Yes, that was my faulty recollection - It's been a while since I was on that adventure !
FruityNutPi wrote:
Thu Jun 01, 2023 5:41 pm
I haven't tried using SPI mode for transmitting and reading the SWD data.
I had speculated SPI could be used to package up the bits and thereby increase transmission speed so it's good to hear someone has made that work.

I would recommend getting it to work bit-banged first but if you have ready-rolled, tried and tested, SPI code there's no reason not to use that.
