HUMS2023 Data Challenge
Questions and Answers

  1. Q: How do I submit a question?

    A: Send an email to DSTG.HUMSConference@defence.gov.au. Questions will be answered on this page.

  2. Q: Is there an H5 data format available?

    A: No. You may convert the .mat files to .h5 in MATLAB, or read them in Python, which should be able to read the .mat format directly.
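
    As a rough illustration of the Python route, the sketch below reads every variable from a .mat file with SciPy and falls back to h5py for MATLAB v7.3 files, which are HDF5 containers. The helper name `load_mat_variables` is hypothetical, and the variable names stored in the challenge files are not stated here, so the function simply returns whatever it finds.

```python
import numpy as np
from scipy.io import loadmat


def load_mat_variables(path):
    """Return a dict mapping variable name -> NumPy array for one .mat file.

    Hypothetical helper for illustration; not part of the challenge material.
    """
    try:
        # scipy.io.loadmat handles MATLAB files up to v7.2.
        data = loadmat(path)
        # Drop the metadata entries loadmat adds (e.g. "__header__").
        return {k: v for k, v in data.items() if not k.startswith("__")}
    except NotImplementedError:
        # loadmat raises NotImplementedError for v7.3 files (HDF5 based).
        import h5py  # only needed for the fallback

        with h5py.File(path, "r") as f:
            # Arrays read this way usually have their dimensions reversed
            # compared with how MATLAB displays them.
            return {
                k: np.array(f[k])
                for k in f.keys()
                if isinstance(f[k], h5py.Dataset)
            }
```

    Usage would look like `variables = load_mat_variables("some_file.mat")`, followed by inspecting `variables.keys()` to see what the file contains.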

  3. Q: I'm having trouble with download links.

    A: UPDATED: The download links were provided as part of the registration process. Email us if you still have trouble after trying again.

  4. Q: My dataset got corrupted. Is there a repair option?

    A: Yes. While it is quicker for most people to re-download the 6.6GB, we understand that internet connectivity can be challenging (price, speed, reliability), so we have PAR2 recovery files available on request. First run the verification file in your preferred PAR2 software (e.g. MultiPar or MacPAR deLuxe), then email us with the number of repairs required.

  5. Q: Is the sampling frequency hidden intentionally, or was it just left out?

    A: The original sampling rate of the raw data was 65573.77049180328 Hz, as determined by the NI-DAQ board, but it is not relevant to the averaged data. The resampling rate for the hunting tooth average is 405405 samples per 99 revolutions of the planet gear, or equivalently per 35 revolutions of the planet carrier.
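
    The figures above imply a whole number of samples per revolution of each component. The short sketch below works this out and shows one way to convert a sample index into planet carrier revolutions; the constant and function names are illustrative only.

```python
from fractions import Fraction

SAMPLES_PER_AVERAGE = 405405  # samples in one hunting tooth average
PLANET_REVS = 99              # planet gear revolutions spanned by the average
CARRIER_REVS = 35             # planet carrier revolutions spanned by the average

# Exact ratios, kept as fractions to avoid rounding.
samples_per_planet_rev = Fraction(SAMPLES_PER_AVERAGE, PLANET_REVS)
samples_per_carrier_rev = Fraction(SAMPLES_PER_AVERAGE, CARRIER_REVS)


def carrier_revolutions(sample_index):
    """Angular position of a sample, in planet carrier revolutions."""
    return sample_index / samples_per_carrier_rev


print(samples_per_planet_rev)   # 4095
print(samples_per_carrier_rev)  # 11583
```

    Multiplying `carrier_revolutions(i)` by 360 gives an angle axis in degrees of carrier rotation.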

  6. Q: I did not understand (in 01_Notes_Navg_Hunting_for_SSA.txt) what is meant by:
    • "So, Navg_hunting = 12 hunting tooth periods", and
    • "over the 100-sec data acquisition period"
    Is it the time duration of each file?

    A: The 100-sec data acquisition period contained over 12 hunting tooth periods which were used to obtain the hunting tooth averages. The hunting tooth averaged data is no longer in the time domain, rather, it is in the angle domain.

  7. Q: Quoting the description pdf "only the hunting-tooth synchronous averages of the planet/ring gears": What is the averaging performed with respect to?

    A: We suggest finding a reference paper in the literature regarding the hunting tooth average. It is calculated with respect to the rotation of both the planet gears and planet carrier over 99 revolutions of the planet gears or 35 revolutions of the carrier.
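
    For readers new to synchronous averaging, here is a minimal sketch of the general idea, not the committee's processing code. It assumes the signal has already been resampled to the angle domain with a fixed number of samples per hunting tooth period (the resampling step itself is omitted), and it uses a toy period of 4 samples rather than 405405.

```python
def synchronous_average(signal, samples_per_period):
    """Average consecutive periods of an angle-domain signal.

    Content that repeats every period is kept; content that is not
    synchronous with the period tends to cancel out.
    """
    n_periods = len(signal) // samples_per_period
    if n_periods == 0:
        raise ValueError("signal is shorter than one period")
    average = [0.0] * samples_per_period
    for p in range(n_periods):
        start = p * samples_per_period
        for i in range(samples_per_period):
            average[i] += signal[start + i]
    return [value / n_periods for value in average]


# Toy data: a repeating pattern plus a disturbance that flips sign each period.
pattern = [1.0, 2.0, 3.0, 4.0]
toy_signal = []
for period in range(12):  # 12 periods, echoing Navg_hunting = 12
    offset = 0.5 if period % 2 == 0 else -0.5
    toy_signal.extend(x + offset for x in pattern)

toy_average = synchronous_average(toy_signal, len(pattern))
```

    In this toy case the alternating disturbance cancels and `toy_average` recovers the repeating pattern.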

  8. Q: Can you provide more information regarding load cycles' association with the file numbers?

    A: Yes, please refer to the note here.

  9. Q: For earliest convincing detection of a planet gear crack, what is the threshold for the fault detection? The threshold here holds a balance between false alarm and probability of detection. In other words, what is generally an acceptable false alarm rate for this problem?

    A: As this is a fault detection problem in aviation, where the rate of missed detection should be kept low, you may use your own judgement, based on general practice in the aviation industry, to choose this threshold. The HUMS2023 committee will not give such a recommendation.

  10. Q: What is the due date for submission of the HUMS2023 Data Challenge results?

    A: UPDATED: The due date has been extended until 11:59 AEST on 14th November 2022.

  11. Q: What do we need to submit to the data challenge, algorithm or result?

    A: UPDATED: You will need to submit both the results and the algorithms used to produce the results. If you have tried multiple algorithms, you can submit the best set of results and the algorithm behind those results.
    The HUMS2023 committee has developed a format and has sent a link to registered challengers.

  12. Q: How do we make our submission?

    A: UPDATED: The webpage for the submission is now live. The link to the submission system, with instructions on how to log in, was sent to all registered challengers on 1st October 2022. The page includes the required template for results.

  13. Q: How do you rate our models for the earliest convincing detection and the most accurate trending, and what is the metric to be used for the model evaluation?

    A: It is unlikely that the HUMS2023 committee will be able to use a single metric to evaluate the results, because the methods that can be used for this Data Challenge are wide open, e.g. physics-based, signal-processing-based and purely data-based methodologies. For convincing detection, we are looking for detections of physically explainable characteristic changes caused by the crack in multiple channels of the dataset. For most accurate trending, we are going to compare the trending curve with our estimated crack growth curve produced by our fractography analysis. However, we expect the evaluation process to be a mixture of objective assessment and some subjective views from experts in the fields of machine condition monitoring and mechanical fault diagnosis/prognosis.

  14. Q: What are the dimensions of the gearbox components and what materials are used to make it?

    A: The most comprehensive dimension details for the gearbox can be found in this NASA report. The material for the planet gear is AISI 9310 Alloy Steel.

  15. Q: The files that were downloaded contain names that end with a numerical series. Is this numerical series the time of day that the file was collected? If yes, then it appears that files were normally created every 3 minutes.

    A: Yes, the numerical series in the file names shows the date and time at which the raw vibration data file was recorded. Within one load cycle, files were recorded at 3-minute intervals at 125 percent torque.

  16. Q: There are some gaps in the 3 minute interval (such as Day021_Hunting_SSA_20211208_155040 to Day021_Hunting_SSA_20211208_160102). Does this mean that there is missing data? Or does this mean that the files were saved automatically once a row limit (405405) was reached? If the row limit is the way for saving the data, then is there a record of stops and starts that can account for the time differences? This will contribute to the noise in the data and will need to be identified.

    A: No, there was neither missing data nor a row limit. The length of 405405 is not the raw data length but the data length of the hunting tooth averages. Where the gap between two files is larger than 3 minutes (e.g. just over 10 minutes), the two files were recorded in different load cycles. Please refer to the data description document for more detail about load cycles. The automatic recording for each load cycle was triggered by the 125 percent torque level.
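
    To illustrate the point about gaps, the sketch below parses the timestamp at the end of each file name and starts a new group whenever the gap to the previous file exceeds a threshold. The 5-minute threshold and the third file name are assumptions made for the example, and the function names are hypothetical; the authoritative mapping between files and load cycles is the .mat file mentioned in Q/A #24.

```python
from datetime import datetime, timedelta


def file_timestamp(name):
    """Parse the trailing YYYYMMDD_HHMMSS stamp of a data file name."""
    stem = name[:-4] if name.endswith(".mat") else name
    date_part, time_part = stem.split("_")[-2:]
    return datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")


def split_by_gap(names, max_gap=timedelta(minutes=5)):
    """Group file names; a gap longer than max_gap starts a new group.

    max_gap is an assumed value sitting between the nominal 3-minute
    interval and the roughly 10-minute gaps between load cycles.
    """
    groups = []
    previous = None
    for name in sorted(names, key=file_timestamp):
        stamp = file_timestamp(name)
        if previous is None or stamp - previous > max_gap:
            groups.append([])
        groups[-1].append(name)
        previous = stamp
    return groups


names = [
    "Day021_Hunting_SSA_20211208_155040",  # from the question above
    "Day021_Hunting_SSA_20211208_160102",  # from the question above
    "Day021_Hunting_SSA_20211208_160402",  # hypothetical name, for illustration
]
gap = file_timestamp(names[1]) - file_timestamp(names[0])
cycles = split_by_gap(names)
print(gap)          # 0:10:22
print(len(cycles))  # 2
```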

  17. Q: The data has been collected across multiple days and the timestamps at the end of the file names suggest that the gearbox was stopped and restarted each evening and morning (stopped around 16:00 each day and started around 10:00 each morning). Does this mean that the starts and stops for the gearbox were clipped from the data? Or were they left in? I am looking to account for noise here.

    A: Yes, the gearbox ran continuously within each test date for most dates; on one or two dates the test had a lunchtime stoppage. Our understanding is that the stop-start process would affect the amplitude of the vibration signatures due to variations in lubrication oil temperature.

  18. Q: Do other team members need to register?

    A: UPDATED: While team members will need to agree to the same Terms and Conditions, a single registered team member will suffice. They will need to report who their team members are when the results are submitted. Note, as of 30 September 2022, registration is closed to new teams.

  19. Q: Can you please confirm that there are 526 datasets, with 405405 sample points in each, and each one is collected in 100 seconds? The reason I’m asking this is to visualize the data based on time, and not the revolutions.

    A: There are 526 data files in the dataset. Each file has 4 channels of averaged (not raw) vibration data, and each channel contains 405405 data points. Since the vibration data were synchronously averaged (i.e. hunting tooth average), the data were resampled based on shaft angle (or revolutions) and not based on time. The raw vibration data were sampled based on time. Please also refer to the answers to some other questions, e.g. Q/A #5, #6, #7 and #16.

  20. Q: If I know that each test (each of 526 data files, with 4 channels in each) is 100 seconds, then I can reshape my graphs to show time on the axis instead of revolutions. This way, I can show the time at which the crack initiated. At the moment, my graph shows when the crack initiated on the revolution axis.

    A: I can give an equivalent time for the 405405 data points.
    The data length for each data file is not 100 seconds, but is equivalent to about 6.1 seconds (35 revolutions of the output shaft, so 35 / 5.73333 = 6.1) worth of data based on the nominal output shaft speed of 5.73333 Hz.
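
    The arithmetic above can be reproduced as follows. This is only a nominal conversion: the averaged data are in the angle domain, so any time axis derived this way assumes the shaft ran at exactly the nominal speed. The constant and function names are illustrative.

```python
N_SAMPLES = 405405           # data points per channel in each file
CARRIER_REVS = 35            # output shaft revolutions spanned by one file
NOMINAL_OUTPUT_HZ = 5.73333  # nominal output shaft speed

# Equivalent duration of one file at nominal speed, in seconds.
equivalent_duration_s = CARRIER_REVS / NOMINAL_OUTPUT_HZ

# Equivalent (nominal) sampling rate of the angle-domain data, in Hz.
equivalent_rate_hz = N_SAMPLES / equivalent_duration_s


def equivalent_time_s(sample_index):
    """Nominal time of a sample within a file, assuming constant speed."""
    return sample_index / equivalent_rate_hz


print(round(equivalent_duration_s, 2))  # about 6.1 s, as stated above
```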

  21. Q: When is the last chance to join the HUMS2023 Data Challenge?

    A: UPDATED: For scheduling reasons, we need to know the total number of participants in the Data Challenge, so we will close registration for the Data Challenge at the end of September 2022. As of midnight AEST 30 September 2022, new teams cannot be registered.

  22. Q: Is this a supervised or an unsupervised problem?

    A: Your team can decide if it's a supervised or unsupervised problem. However, it's probably more like an anomaly detection problem, where anomalies are expected when a gear fault (crack) occurs.

  23. Q: We are required to submit the results and algorithms. So what are the results (for the convincing detection and most accurate trending)? Are there 2 trending plots?

    A: For earliest convincing detection, you need to show from which file number you can start to detect the gear fault, and why. For most accurate trending, you will need to show your trending curve/plot. Also, please refer to Q/A #13. More details about the submission will be given on the website soon.

  24. Q: Is there a link between the load cycle numbers and the file numbers?

    A: Click here for the .mat file that shows the linkage. This will be useful in submitting the results.

  25. Q: In Q/A #17, you mentioned that the vibration amplitude could be affected by oil temperature. Is there any oil temperature data available?

    A: Yes. We recorded some lube oil temperature data in our test log book (hand written). Please see the .mat file for the data with some explanation notes and .fig file for the temperature plot with the ambient temperatures (e.g. min~max) of each test day.

  26. Q: Do you have any data for shaft speed and torque load?

    A: Yes, we have averaged input shaft speed (in %) and input or turbine torque (TT in %) for each data file available here.

  27. Q: Where is the template and link to submit results?

    A: The submission link, along with login details, was sent to all registered members on 01 October 2022. The template is available to download from the submission page.

  28. Q: Can I still register for the challenge and get the data?

    A: Judging the challenge submissions is a considerable effort, and we also need to finalize the presentation schedule for HUMS2023. So we needed to close registrations in September 2022. As registration is now closed, no new teams will be accepted.
    UPDATE: The Helicopter Main Rotor Gearbox Planet Gear Fatigue Crack Propagation Test Data was announced to be released on 12 July 2023!
    This includes more data than the original HUMS2023 Data Challenge dataset, and it is now conditionally available to everyone. For full details see https://www.dst.defence.gov.au/our-technologies/helicopter-main-rotor-gearbox-planet-gear-fatigue-crack-propagation-test.

  29. Q: Emails are bouncing, is there a problem?

    A: For emails we send, check your Junk/Spam folder, add DSTG.HUMSConference@defence.gov.au to your address book, and flag incoming correspondence as not spam. All registered emails should have received log-in details for the submission system already.
    There was a known outage on 7th Oct 2022, but our upstream hosting provider resolved this overnight.

  30. Q: Was there an oil change in the last 60 cycles?

    A: No, the oil was not changed in the last 60 cycles.

  31. Q: About the axis/direction of the sensors: is there any Cartesian equivalent for the axis/direction of the accelerometer sensors, e.g., 'IP1' being on 'X' axis, 'RF2' being on 'Y' axis, etc.?

    A: The IP1 sensor is placed close to vertical with respect to the input shaft; the other 3 are all placed radially with respect to the output (or planet carrier) shaft. There is no X and Y reference direction.

  32. Q: Some of the mean values are negative. What can negative sensor readings mean? Is this related to direction?

    A: The negative mean values for the 3 channels from the RF, RS and RR sensors are most likely caused by errors in the data acquisition system's AC coupling configuration, which should have removed the DC components (or mean values) from the vibration signals. However, these mean values carry no information about the characteristic vibration of the gearbox.
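
    Since the mean values carry no diagnostic information, one option is to subtract the mean of each channel before further analysis. A minimal sketch, using a hypothetical helper name and made-up sample values rather than real channel data:

```python
from statistics import fmean


def remove_mean(channel):
    """Return a copy of the channel with its mean (DC component) removed."""
    mean_value = fmean(channel)
    return [x - mean_value for x in channel]


# Made-up values with a negative offset, standing in for one channel.
example_channel = [-0.25, 0.75, -1.25, -0.25]
centred = remove_mean(example_channel)
```

    After this step the channel fluctuates around zero, and its shape is otherwise unchanged.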

  33. Q: Can I have an extension?

    A: UPDATED: Considering the number of requests for an extension of the due date for result submission, we have decided to extend the due date by nearly two weeks, until midday on 14th Nov 2022.
    Further extensions are only available by explicit arrangement.

  34. Q: How do we present results?

    A: UPDATED: Instructions were sent to your team's primary email address. The presentations will need to be submitted no later than 23:59 AEDT 20th February 2023.

  35. Q: Did I win?

    A: The HUMS2023 Data Challenge awards will be announced at the HUMS Dinner.

Updated 2024.05.09 15:15 Melbourne