Difference between revisions of "Workdocumentation 2021-03-30"

From OPENRESEARCH mk copy Wiki

Revision as of 01:02, 31 March 2021

Red Links and Data Fixations

Broken redirects

Broken file links

Broken properties and errors

  • The property "Property:Has improper value for" stores all invalid values.

Ordinals

  • The following approaches can be used to fix ordinals:
    • This approach finds all events with improper ordinals and edits them in place:
wikiedit -t wikiId -q "[[Has improper value for::Ordinal]]" --search "(\|Ordinal=[0-9]+)(?:st|nd|rd|th)\b" --replace "\1"
  • A code snippet can be combined with wikibackup and bash tools to edit specific pages: Code Snippet
  • Pipeline usage:
grep Ordinal /path/to/backup -l -r | python ordinal_to_cardinal.py -stdin -d '../dictionary.yaml' -ro -f | wikirestore -t ormk -stdinp -ui
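The search-and-replace used in the wikiedit call can be tried locally before touching the wiki. The following is a minimal Python sketch that applies the same regular expression to a page's markup; the function name and the sample page are made up for illustration, and it only covers numeric ordinals such as "3rd", not spelled-out ones such as "third".

```python
import re

# Same pattern as in the wikiedit call: capture "|Ordinal=<digits>" and
# drop a trailing English ordinal suffix (st, nd, rd, th).
ORDINAL_PATTERN = re.compile(r"(\|Ordinal=[0-9]+)(?:st|nd|rd|th)\b")


def strip_ordinal_suffix(wiki_markup: str) -> str:
    """Return the markup with numeric ordinals reduced to plain numbers."""
    return ORDINAL_PATTERN.sub(r"\1", wiki_markup)


# Hypothetical page content, only for demonstration.
sample_page = "{{Event\n|Acronym=EXAMPLE 2020\n|Ordinal=3rd\n}}"
print(strip_ordinal_suffix(sample_page))
```

Values that are already plain numbers, or that are spelled out in words, are left unchanged by this pattern.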

Improper Null values for Has person

  • Has person was using "some person" as a null value. The usage was inconsistent: in the free text, events would use "some person" while the Wikison Format info contained the actual person name.
  1. The first approach is to remove the free text altogether. A code snippet was used together with the bash utility grep. Usage:

grep 'some person' -r '/path/to/backup' -l | python scripts/Data_Fixes.py -stdin -ro -rmf | wikirestore -t ormk -stdinp -ui

Output result: ICFHR 2020

  2. The second approach is to remove only the 'some person' entry from the wiki free text. A Python snippet is used together with the bash utility grep. Usage:
grep 'some person' -r '/path/to/backup' -l | python scripts/Data_Fixes.py -stdin -ro -rdf | wikirestore -t ormk -stdinp -ui

Output result: ACCV_2020
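The two approaches can be illustrated with a small Python sketch. This is not the code of scripts/Data_Fixes.py; it assumes, purely for illustration, that a page consists of one template block ending with a line containing only "}}", followed by free text, and that placeholder entries sit on their own lines. All function names and the sample page are hypothetical.

```python
PLACEHOLDER = "some person"


def split_page(page_text):
    """Split a page into (template_lines, free_text_lines).

    Assumption: the template block ends at the first line that is just '}}'.
    """
    lines = page_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "}}":
            return lines[: index + 1], lines[index + 1:]
    return lines, []


def remove_free_text(page_text):
    """First approach: keep only the template block."""
    template_lines, _ = split_page(page_text)
    return "\n".join(template_lines)


def remove_placeholder_lines(page_text):
    """Second approach: drop only free-text lines that mention the placeholder."""
    template_lines, free_lines = split_page(page_text)
    kept = [line for line in free_lines if PLACEHOLDER not in line]
    return "\n".join(template_lines + kept)


# Hypothetical page content, only for demonstration.
sample_page = "{{Event\n|Acronym=EX 2020\n}}\nIntro line.\nChair: some person\nClosing line."
print(remove_free_text(sample_page))
print(remove_placeholder_lines(sample_page))
```

The second variant keeps the remaining free text intact, which matters when the free text carries information that is not present in the template.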


Dates

End dates, and dates in general, are stored as strings.

Acceptance Rate Issue

  • Statistics for the missing values for Submitted papers:
    • Number of pages that have the field "Submitted papers": 1716
    • Number of pages that have the field "Accepted papers": 1965
    • Number of pages that have the field "Submitted papers" but no field "Accepted papers": approximately 63
    • Number of pages that have the field "Accepted papers" but no field "Submitted papers": approximately 302
  • In general the papers
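As a quick consistency check on the counts above, the number of pages carrying both fields can be estimated in two ways. The sketch below uses only the four figures listed; the variable names are illustrative.

```python
# Counts taken from the statistics listed above.
submitted_total = 1716
accepted_total = 1965
submitted_without_accepted = 63    # approximate
accepted_without_submitted = 302   # approximate

# Pages with both fields, estimated from each side.
both_via_submitted = submitted_total - submitted_without_accepted
both_via_accepted = accepted_total - accepted_without_submitted
discrepancy = abs(both_via_submitted - both_via_accepted)

print(both_via_submitted, both_via_accepted, discrepancy)
```

The two estimates (1653 and 1663) differ by 10 pages, which is in line with the two "missing" counts being approximate. Only pages with both fields allow an acceptance rate to be computed.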