LBA Hb

Pre-setup things to do or check before setting up for LBA observations:

  1. Make sure you are part of the LBA chat on Mattermost. Any sudden change to the experiment schedule, and any other information, will be posted there. You can access it by pointing a browser at https://chat.atnf.csiro.au/. If you don't have a personal account you can log in as hbobserver@gmail.com (usual observer password). Keep an eye on the messages. The LBA observers at other stations are a good first port of call for problems at Hobart/Ceduna. Chris Phillips is the main LBA coordinator at the ATNF. Jamie Stevens also knows a lot about the systems here. There are also Mattermost apps for smartphones if you want to use them.
  2. Check the information email for the session. If in any doubt, cross-check against the information for the experiment on the ATNF LBA wiki.

Preparing schedule files

The steps for preparing LBA schedule files for use at the Hobart 12m are:

  1. First, access the Hb VNC with vncviewer pcfs-2hb:1 from ops8. You can also use 'vnc://pcfs-2hb.phys.utas.edu.au:5901' from the built-in Mac VNC client. You will need the UTAS VPN to connect to the VNCs if you are observing from outside the UTAS network.
  2. Find the name and code of the experiment you are setting up for (e.g. v255ah), and its start and end times in UTC. This information is on the UTAS observing wiki (ra-wiki) page or on the official ATNF LBA wiki.
  3. Check whether the proc file and snp file for the experiment already exist. From oper@pcfs-2hb, look in the /usr2/proc and /usr2/sched folders. For example, for experiment v255ah (from oper@pcfs-2hb):
    cd /usr2/proc/
    ls v255ahhb.prc
    cd /usr2/sched/
    ls v255ahhb.snp

    The files will most probably have been drudged and checked before the session/experiment starts by the main UTAS LBA support person, so if they are in their respective folders, all is well. If they are not there, you need to download and drudge the experiment to make them.
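
The file check in step 3 can be sketched as a small Python helper. This is only an illustration: the function names are hypothetical, and the naming rule (experiment code followed by the two-letter station code) and the /usr2/proc and /usr2/sched locations are taken from the v255ahhb example above.

```python
from pathlib import Path


def expected_files(experiment, station, proc_dir="/usr2/proc", sched_dir="/usr2/sched"):
    """Return the proc and snp paths expected for one experiment at one station."""
    stem = f"{experiment}{station}".lower()  # e.g. v255ah + hb -> v255ahhb
    return [Path(proc_dir) / f"{stem}.prc", Path(sched_dir) / f"{stem}.snp"]


def missing_files(experiment, station, **dirs):
    """Return the expected files that are not present on this machine."""
    return [p for p in expected_files(experiment, station, **dirs) if not p.is_file()]


if __name__ == "__main__":
    for path in missing_files("v255ah", "hb"):
        print(f"missing: {path}")
```

If the script prints nothing when run on pcfs-2hb, both files are in place. If it lists a file, follow the drudge instructions or contact the on-call person.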

How to drudge

NOTE: For Hb and KE, proc files made by drudg currently do not contain the right setup information. The main UTAS LBA support person will most probably have made these proc files before the start of the session, and you will find them in the proc folder. If they are not there, contact the on-call person.

Everything is done from oper@pcfs-2hb. The experiment vex file will be in the home area or in the /usr2/sched/ directory. NOTE: Even for KE, you need to drudge and make the proc and snp files here on pcfs-2hb, because drudg on KE doesn't work at the moment. After drudging the files for KE, manually copy them to the relevant directories on pcfs-2ke (e.g. v255ahke.prc in /usr2/proc/, and v255ahke.snp and v255ahke.sum in /usr2/sched/).

Now to drudg, run:

from oper@pcfs-2hb

drudg /usr2/sched/v255ah.vex
hb (choose antenna)
11 (show/set equipment type)
19 16 1 1 (inputs for equipment type)
3 (make a .snp file)
5 (print a summary file)
12 (make a proc file; remember that this is not working for Ke and Hb at the moment, so skip this step for now)
0 (done with drudge)

Now move the proc file to the /usr2/proc/ directory:
mv /usr2/sched/v255ahhb.prc /usr2/proc/v255ahhb.prc

After this is done, make sure the proc and snp files are in their relevant directories on pcfs-2hb (e.g. v255ahhb.prc in /usr2/proc/, and v255ahhb.snp and v255ahhb.sum in /usr2/sched/).

Pre-checkups before setting up for the experiment

  1. The Hobart12 field system is fs-lke. If you are setting up for a new experiment from scratch, close the field system running on all pcfs machines (e.g. on pcfshb and/or pcfs-2hb) by typing 'terminate' in the Oprin window, and start the right field system by typing 'fs-lke' in the field system window (or any window on oper@pcfs-2hb). If an experiment is already running and you are doing a change-over setup, you don't need to kill the fs (the right field system is most probably already running).
  2. Check whether jive is running. From oper@pcfs-2hb, do ssh observer@flexbuflke, then check jive with:
    ps -ef | grep -i jive
    If it returns jive5ab-3.1.0-64bit-Release, it is running and all is well. If jive is not running, then from observer@flexbuflke:
    StartJ5
    This automatically runs jive5ab-3.1.0-64bit-Release and logs the debug information to a log file under /home/observer/jive5ab.logs/, which is convenient if we need to review what went wrong in a session. If that doesn't work, you can also start jive5ab directly with:
    jive5ab-3.1.0-64bit-Release
  3. Make sure that the dbbc is running by doing vncviewer dbbc3hb from oper@pcfs-2hb. Check that the right version of the dbbc software is open (DBBC3 Control DDC_V_v124.exe). If not, close the one that is currently open and open the right version. It should be somewhere in the middle of the desktop.
  4. Ensure RfSwOnForS is turned ON at pduhb and ipske. These are the Internet power switches: for Hobart12, go to the pduhb website, and for Katherine, go to the ipske website. Look for the port labelled RfOnForS and make sure that it is turned ON.
  5. Now look at the dbbc3 stats (power levels and delays). If the DBBC3 samplers go bad or lose sync (entries shown in red), the best remedy is a full power cycle. NOTE: We use channels A, B, E and F on the Hb dbbc3 and channels A, B, C and D on the Ke dbbc3, so only the stats for those channels on the respective dbbc3 need to be right.

    To do this, first close the dbbc software running on the dbbc3 desktop, then shut down the dbbc3 Windows PC through the VNC interface. Then switch off the UPS powering the DBBC3 as follows:

    From oper@pcfs-2hb, run:
    ssh observer@weather
    su
    [Password] Hint: Not the usual password
    upscmd CyberPower load.off
    admin
    [Password] Same password as above

    Now wait for 30 seconds. Then:

    upscmd CyberPower load.on
    admin
    [Password]

    Now wait for it to boot and then restart the server; this takes around 2-3 minutes. Then open the dbbc3hb VNC again with vncviewer dbbc3hb. You will need a password for that, which you can find on this ra-wiki page or by asking someone. The password is the same for both the Hb and Ke dbbc3.

    Once you are on the dbbc3 desktop, start the right dbbc version and hit 'yes' to reconfigure.

    It takes around 20-30 minutes for the system to configure fully. This usually fixes the problems for the rest of the session but isn't 100% effective. If it doesn't, repeat the same process until the issue is fixed, however many tries it takes.
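
The jive check in step 2 can also be scripted. The sketch below is a hypothetical helper, not part of the station software. It assumes it is run on flexbuflke, where ps -ef is available, and that the running binary has "jive5ab" in its name, as in the jive5ab-3.1.0-64bit-Release example above.

```python
import subprocess


def jive5ab_lines(ps_output, name="jive5ab"):
    """Return the ps lines that mention jive5ab, skipping any grep process."""
    hits = []
    for line in ps_output.splitlines():
        lowered = line.lower()
        if name in lowered and "grep" not in lowered:
            hits.append(line.strip())
    return hits


def jive5ab_running():
    """Run 'ps -ef' locally and report whether a jive5ab process is listed."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return bool(jive5ab_lines(result.stdout))
```

Calling jive5ab_running() returns False when no jive5ab process is listed, in which case start it with StartJ5 as described in step 2.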

Loading the procedure files

Now load the procedure files by entering the following:

proc=v255ahhb
exper_initi
setup01
antenna=open@!,30s
iread
bread
clkoff
maserdelay

dbbc3=pps_delay # should report 43/39 ns for all the DBBC3 cores. If not, you need to reconfigure the dbbc3
mk5=set_disks? # should give 34 for flexbuflke
mk5=rtime? # shows how much free space you have. If free space is less than 4-5%, let your on-call person know.
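
The two recorder checks above amount to simple threshold tests. The sketch below is only an illustration: the function name is hypothetical, the two numbers are read off the field system replies by hand (the reply format is not parsed here), and the 5% default is an assumption that takes the upper end of the "4-5%" guideline.

```python
def precheck_problems(disk_count, free_percent, expected_disks=34, min_free_percent=5.0):
    """Return a list of human-readable problems; an empty list means both checks pass."""
    problems = []
    if disk_count != expected_disks:
        problems.append(f"set_disks reports {disk_count} disks, expected {expected_disks}")
    if free_percent < min_free_percent:
        problems.append(f"only {free_percent:.1f}% free space left, tell the on-call person")
    return problems
```

For example, precheck_problems(34, 60.0) returns an empty list, while a low free-space value returns a message reminding you to contact the on-call person.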

Now do a test scan recording as follows:

scan_name=test,v255ah,hb
disk_record=on
mk5=evlbi?
mk5=evlbi?
mk5=evlbi?
disk_record=off

Check that the number reported in the fs keeps increasing each time you run mk5=evlbi?. If it does, data is being recorded.
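
The "keeps increasing" test can be written down as a short sketch. The function name is hypothetical, and the values are assumed to be the numbers you read from successive mk5=evlbi? replies, in the order they were taken.

```python
def is_recording(counts):
    """Return True if every reading is strictly larger than the one before it."""
    if len(counts) < 2:
        raise ValueError("need at least two readings to see a change")
    return all(later > earlier for earlier, later in zip(counts, counts[1:]))
```

Three readings that grow each time indicate that data is flowing. Readings that stay flat suggest that nothing is being recorded.
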
Now do:

scan_check # this should show the latest recorded scan in field system.

also, from observer@flexbuflke, do:
vbs_ls -lrth 'v255ah*' | tail -10

This will show your recorded scan with the right name, which confirms that data is being recorded in the right place.

After all the above checks are done, you can start the experiment as follows:
schedule=v255ahhb,#1
onsource # antenna should be tracking or slewing and then tracking to the source.
list # this will show you the next upcoming scan details in fs

OR, for a late start, do:
schedule=v255ahhb,2021.210.03:00:00 (any particular time from which you want to start the schedule; hint: the time right now)
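
The time in the late-start command is UTC, written as year, day of year and time of day, as in the 2021.210.03:00:00 example above. A minimal sketch for producing that string for the current time is given below. The function names are hypothetical, and the format is inferred from that one example.

```python
from datetime import datetime, timezone


def fs_time(moment=None):
    """Format a timezone-aware datetime as YYYY.DDD.HH:MM:SS in UTC (default: now)."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    elif moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time mistakes")
    # %j is the zero-padded day of the year (001-366).
    return moment.astimezone(timezone.utc).strftime("%Y.%j.%H:%M:%S")


def late_start_command(schedule, moment=None):
    """Build the schedule= line for a late start at the given time."""
    return f"schedule={schedule},{fs_time(moment)}"


if __name__ == "__main__":
    print(late_start_command("v255ahhb"))
```

Running the script prints a schedule= line for the present UTC time, which can then be typed into the Oprin window.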

Monitoring:

  1. Make sure that the antenna is tracking, and slewing to the next source when it should, and is not stuck. Check this with the onsource command in the Oprin window.
  2. Make sure that data is being recorded by running watch -n60 "vbs_ls -lrth 'v255ah*' | tail -30" on observer@flexbuflke to check that all scans are being recorded. Also run mk5=evlbi? several times in the Oprin window while a scan is recording to check that data is actually being recorded (the number in the fs should increase every time).
  3. Check that the delays and samplers are good. If not, you need to halt the schedule and reconfigure the dbbc3.
  4. Monitor for wind stows; these should be obvious if you are watching the antenna move. Make sure that the wind is OK in the antenna_monitor window.
  5. Keep an eye out for ERRORS in the field system log. There are a few benign ones that come up regularly.
  6. Check how much disk space is left for recording with mk5=rtime?. If free space is less than 4-5%, let your on-call person know.
  7. Record experiment details on the appropriate ATNF wiki page. (Go here, click the session and then the experiment number link, scroll down to the "observing comments for each antenna" section and click Hb to update the Hobart 12 log and Cd to update the Ceduna log.) The login username and password can be found here.
  8. These web pages should always be open:
    Grafana monitoring Page
    ATNF VLBI monitoring page
    ATNF recorder monitoring page
    Local Mt Pleasant live page
    Local Ceduna live page

It is also good to know how to halt and continue the schedule during the experiment, in case you need to during your observations. To halt, do the following from the Oprin window:

halt
disk_record=off
Then, to continue, do the following from the Oprin window:
setup01
cont
list

A few points for observers:

  1. Check every 30 minutes during day shifts and every hour during night shifts.
  2. Always try to use VNC for monitoring.
  3. You can also use Grafana for checks, but the VNCs are more reliable.
  4. Make sure that you check at least once every 30 minutes that everything is running smoothly.
  5. Always have the LBA Mattermost chat open when you are on observing duty so other people can reach you.

Troubleshooting some common errors and problems:

Stuck antenna:

If the antenna is stuck and you see the following errors in the fs window:
2012.297.03:58:54.01?ERROR st -998 reading SystemClock1
2012.297.03:58:54.01?ERROR st -999 TCP/IP connection was closed by remote peer
2012.297.03:58:54.01?ERROR st -5 Error return from antenna, see Mbus error

OR

2012.297.03:58:54.01?ERROR st -998 antenna stuck

then you can do the following.
First, in the Oprin window, type:

antenna=open
antenna=operate
antenna=status

If this doesn't solve the issue, then:

  1. Halt the schedule using the steps described earlier on this page.
  2. Go to the timehb VNC with vncviewer timehb.

    Then, over there:

    Go to the antenna control window and look for red buttons. Turn off the antenna manually by switching from operate to standby. Give it some time to catch up after the change, then switch Drives off.
    Wait a while, then switch Drives back on. After turning the drives on, it is recommended to wait a little before putting the antenna in operate.

    This step is crucial if the previous steps haven't resolved the issue.

    This will generally free the stuck antenna. If it doesn't, do the RESETS at the bottom. In the end, we want green buttons at operate and drives on in the antenna control window.

    Note: Sometimes, when the antenna goes to wind stow, it does not come back automatically after the wind drops. In that case, halt the schedule, reboot the timehb machine using the ipswitch website, then log in to the timehb machine again, open all the programs that were open before, and make sure that everything is all right. Then terminate the field system, start it again and do a quick re-setup. This will solve the issue.
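
The stuck-antenna error lines quoted earlier in this section follow a regular pattern (timestamp, ?ERROR, a two-letter source, a numeric code, a message), so they can be picked out of saved log text with a short script. This is a sketch only: the pattern is inferred from those quoted lines, the function names are hypothetical, and treating the "st" source as the antenna-related one is an assumption based on the same examples.

```python
import re

# Pattern inferred from lines such as:
# 2012.297.03:58:54.01?ERROR st -998 antenna stuck
ERROR_LINE = re.compile(
    r"^(?P<time>\d{4}\.\d{3}\.\d{2}:\d{2}:\d{2}\.\d{2})\?ERROR\s+"
    r"(?P<source>\S+)\s+(?P<code>-?\d+)\s*(?P<message>.*)$"
)


def parse_error(line):
    """Return a dict for an ERROR log line, or None if the line is something else."""
    match = ERROR_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])
    return fields


def antenna_errors(lines, source="st"):
    """Return the parsed ERROR entries whose source matches (assumed 'st' for the antenna)."""
    parsed = (parse_error(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry["source"] == source]
```

Feeding it the lines copied from the fs window returns one entry per error, which makes it easier to see when the errors started and which codes repeat.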