ASSESSMENT IN PUBLIC SCHOOL OF SINDH: PRACTICES, TRENDS AND IMPLICATIONS

Large-scale assessment is used to evaluate the overall education system. It emerged during the 1980s as a tool to inform education systems across the world. In Pakistan, large-scale assessment was initiated by the National Education Assessment Center, and assessment centers were also established in the provinces and areas. Today, ASER, PEACe and SAT are the major large-scale assessments conducted in the public sector. This research aimed to examine the large-scale assessment practices in public schools of Sindh, the rationale of large-scale assessment and its contribution to improving overall education. The research is situated in the qualitative paradigm; data were collected through document review and analysis as well as interviews. The findings confirm that the three major large-scale assessments have been reporting a dismal state of student achievement. Scores generally range between 25 and 30, which is alarming. Textbook-based teaching, teachers' lack of awareness about developing standardized and higher-order assessments, and the unavailability of basic facilities are the core issues associated with students' low scores. The consistently low scores demand that the rationale of large-scale assessment be reconceptualized; the focus may be shifted to promoting formative assessment and preparing public sector teachers to conduct effective and meaningful assessments. This research also notes that policies regarding assessment already exist and recommends that they be implemented in their true spirit.


Introduction
Large-scale assessment is largely used for educational accountability at the policy level (Wyatt-Smith, 2014). Unlike in developed countries, the idea of assessment at large scale evolved relatively late in Pakistan. In 2003, the National Education Assessment System (NEAS) was established in Islamabad, and it conducted six cycles of national assessments. To support the national assessment activities, Provincial Education Assessment Centers were established in all provinces and regions of Pakistan. These centers were part of a five-year plan funded by the World Bank and the Department for International Development. After the project ended, the assessment centers either merged with other provincial departments or started working independently. In Sindh, the Provincial Education Assessment Center (PEACe) Sindh was institutionalized, and from 2009 it conducted six cycles of diagnostic assessment (Provincial Education Assessment Center Sindh, 2022). The focus has been on grades 3, 4 and 8, and the subjects emphasized were Mathematics, Language, Science and Social Studies. After the 18th Amendment, education became a provincial subject, and PEACe, as a wing of the Directorate of Curriculum, Assessment and Research Sindh, Jamshoro, has been mandated to conduct large-scale assessment in Sindh for grades III, V and VIII. Consequently, in 2017 and 2022, PEACe conducted large-scale assessments in grades III and V, respectively. In these assessments, PEACe collects and compiles not only students' scores but also background information on students, parents, teachers, head teachers and schools. This information is then correlated with students' performance. In this way, PEACe attempts to present a holistic picture of the education system and the factors affecting it. In Sindh, the Reform Support Unit (RSU) also initiated a project under the title of Standardized Achievement Test (SAT).
The planning and administration of SAT was outsourced to the Institute of Business Administration (IBA) Sukkur University. The aim of the SAT project was to assess students' learning in Language (Sindhi/Urdu/English), Math and Science. SAT focused on grades 5 and 8. Unlike the sample-based assessment of PEACe, SAT covered the whole population, which means that all students studying in grades 5 and 8 of public schools had to take the test. From 2011, six cycles of SAT were completed.
Another large-scale assessment is the Annual Status of Education Report (ASER) Pakistan, conducted by Idara-e-Taleem-o-Aagahi (ITA). ASER conducts a nationwide survey through systematic sampling in each district. The primary objective of the ASER survey is to generate estimates of children's schooling status and basic learning levels at the district level. The survey focuses on households. To measure learning levels, children between the ages of 5 and 16 are assessed on reading, arithmetic and general knowledge skills. Unlike the diagnostic assessment or the standardized test, this assessment is not referenced to the learning outcomes expected of children at their specific grade and age. Rather, it focuses on the ability to read, do sums and know the general information that a second-grade student needs to possess. Further, children are assessed in the household environment, and no child is tested at school. According to the ITA, ASER is trying to fill the vacuum by asking, 'Are our children learning?'

ISSN (P): 2788-4821 & ISSN (O): 2788-483X, Volume 3, Issue 1, Page 211-221, April 30, 2022

By focusing on these three large-scale assessment activities, this paper highlights the practices, trends in results, challenges and lessons learned during the process of large-scale assessment in the context of Sindh. Following are the research questions.
1. What are the large-scale assessment practices in public schools of Sindh?
2. What is the rationale of conducting large-scale assessment? Does it need to be reconceptualized?
3. How is large-scale assessment helpful in promoting students' learning outcomes?

Literature Review
Large-scale learning assessments (LSLAs) are defined as a practice of national or cross-national standardized testing that provides a snapshot of learning achievement for a group of learners at a given time and in a limited number of learning domains (UNESCO, 2019). Many developed and developing countries have established their own large-scale assessment systems, while at the international level the Programme for International Student Assessment (PISA) and Trends in International Mathematics and Science Study (TIMSS) are well-known large-scale assessment systems. Historically, large-scale assessments can be traced back to 1985. First, the Organization of American States (OEA) administered standardized tests to a national sample. In 1988 and 1992, the Ministry of Education carried out further testing. Furthermore, the National Assessment System for Educational Quality (SINECE) was established in 1996 and administered large-scale assessment to a sample of grade III, VI and IX students. This large-scale assessment was conducted again in 1998, 2000 and 2001 (Rizo, 2010). Large-scale assessment is considered a highly valued activity that informs education policies, curriculum and practices (Cresswell et al., 2016). Large-scale assessments are considered to have a high degree of validity and reliability, and they can help significantly improve policies and practices, which ultimately benefits education across developed and developing countries. However, there are also critiques of large-scale assessments concerning their purpose, their process and the utilization of their results. For example, a few large-scale assessments focus on the numeracy and literacy skills of youth. However, a large number of children in Pakistan are out of school, which means that the sampled children might never have been in school.
Another issue with large-scale assessment is that it shows only the bigger picture and comes up with general suggestions and recommendations for educational reform (Gür et al., 2012). Very little research on large-scale assessment is available in the context of Pakistan. This research synthesizes the three major large-scale assessments conducted in Sindh, their practices, general results, challenges and lessons learned.

Sustainability in Lifelong Learning: Management Perspective from Pakistan

Methodology
This research was descriptive in nature and aimed to describe the phenomenon of large-scale testing in Sindh. A descriptive design suits this research because it highlights practices, opinions held, differences or relationships that exist, conditions, structures, ongoing processes or evident trends (Bhattacherjee, 2012). Qualitative methods were employed to collect the data, relying heavily on document analysis. The major portion of the document analysis draws on the reports developed by PEACe, IBA Sukkur and ASER Pakistan. The researcher's reflections are also part of the discussion. Semi-structured interviews were conducted with officials of PEACe, IBA Sukkur and ASER Pakistan, mainly to validate the data and to clarify the process and outcomes of the large-scale assessments, so that any comments or claims could be made with care.

Analysis and Findings
If we synthesize the overall large-scale assessment activities carried out in Sindh, a few significant themes emerge for exploration and discussion. The first is the assessment practices carried out by the organizations. Another emerging theme is the consistently low scores of learners and their implications. Moreover, it is interesting to look at the challenges faced and lessons learned through the large-scale assessment activities in Sindh.

Assessment Practices
Although some short-term large-scale assessments can be traced in the past, PEACe, SAT and ASER have been conducting assessments on a regular basis. The basic objective of all these assessments is to measure students' learning, but they also differ from one another. Keeping this in mind, instead of describing each large-scale assessment separately, this findings section presents a cross-case analysis. This will give readers a holistic view of large-scale assessment.

Organizational Structure
It is important to understand the organizational structure of each organization, as this helps to anticipate the future of large-scale assessment in the context of Sindh. PEACe is the only organization that is funded from the recurring budget and works as a public sector organization. However, it needs to get its assessment plan approved by the department, and due to administrative issues PEACe skips the assessment in some years. Since 2015, PEACe has been able to complete only three assessments, in 2015, 2017 and 2022. SAT was a public sector intervention, but it was a project outsourced to a third party. After completing six cycles, SAT was discontinued. Ideally, as the project was phased out, some organization should have continued the work, which unfortunately was not the case. ASER is an international intervention. In Pakistan, it is managed by Idara-e-Taleem-o-Aagahi (ITA). ASER engages civil society and semi-autonomous partners; consequently, more than 10,000 volunteers work for ASER. ASER Pakistan has grown from being active in 11 districts to 138 out of 145 districts in Pakistan, consistently providing ranked and gender-disaggregated data across households, villages, districts and provinces (Idara-e-Taleem-o-Aagahi, 2020).

Method of Large-Scale Assessment
The methodology of each assessment is also different. PEACe uses probability proportional to size (PPS) sampling. At the first stage, schools are selected from all districts of Sindh in proportion to population. From each selected school, only 10 students of the targeted grade participate in the assessment. This is done to reduce the cost of the assessment. Each student attempts a test in the targeted subject. The test is constructed in line with the Provincial Curriculum, and students are tested against the standards described in the curriculum. PEACe also administers background questionnaires to students, parents, teachers and head teachers. The variables included in the questionnaires help to correlate students' performance with the facilities they have (Provincial Education Assessment Center Sindh, 2017). The end product of this assessment is the test score and inferences about the learning opportunities available to students. Results are grouped at the district and provincial levels. Two strata are reported in the results: gender and location (i.e. urban or rural). In SAT, all students of grades five and eight attempted the standardized test. The test comprised the language of instruction (i.e. Urdu, Sindhi or English), General Science and Mathematics. Students had to take the Science and Mathematics tests on the same day. Since all students had to take the test, a good mechanism for keeping schools informed was devised. The test was constructed on the basis of the curriculum and textbooks. SAT developed a very large item bank, which was available online. Like PEACe, SAT also reported scores and ranked the districts and regions accordingly. There are two major differences between SAT and PEACe. First, PEACe tests sampled students, while SAT tested the whole population of students.
Second, PEACe focuses on the overall system and therefore provides detailed information about the factors affecting students' learning achievement, while SAT focused more on scores, students' report cards and schools' performance. Although ASER conducts a household survey, the assessment of children's learning levels is its core objective and activity. ASER has developed simple, easy-to-use tools that measure children's competencies in Language (Urdu/Sindhi/Pashto), English and Arithmetic. The assessment is done in the household environment. The tests are based on the class one and two curriculum and measure a child's reading fluency and comprehension and basic numeracy skills. Without acquiring these basic skills, children may never be able to learn. ASER assesses children in the age group 5-16. The assessment practices are summarized in the following table.
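The two-stage design described above for PEACe (schools drawn with probability proportional to size, then 10 students per selected school) can be illustrated with a minimal Python sketch. It assumes systematic PPS selection and a simple random draw of students within each school; the reports cited here do not specify these details, and the function names, school names and enrollment figures below are hypothetical, chosen only for illustration.

```python
import random


def pps_systematic_sample(sizes, n, rng):
    """Return n indices chosen with probability proportional to size.

    Systematic PPS (an assumed variant): lay the units end to end on a line
    whose length is the total size, then pick n equally spaced points after
    a random start. A unit larger than the sampling interval can be hit
    more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step
    selected = []
    cumulative = 0.0  # combined size of all units before unit idx
    idx = 0
    for k in range(n):
        target = start + k * step
        # Advance until the target point falls inside unit idx.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def sample_students(roster, per_school, rng):
    """Second stage: simple random sample of students within one school."""
    return rng.sample(roster, min(per_school, len(roster)))


def draw_sample(enrollment, n_schools, per_school=10, seed=0):
    """Return {school name: sampled student ids} for a two-stage draw."""
    rng = random.Random(seed)
    names = list(enrollment)
    sizes = [enrollment[name] for name in names]
    chosen = pps_systematic_sample(sizes, n_schools, rng)
    sample = {}
    for i in chosen:
        roster = list(range(enrollment[names[i]]))  # stand-in student ids
        sample[names[i]] = sample_students(roster, per_school, rng)
    return sample


# Hypothetical grade-level enrollment per school (illustrative only).
ENROLLMENT = {"School A": 120, "School B": 35, "School C": 8, "School D": 60}
```

Calling `draw_sample(ENROLLMENT, n_schools=2)` returns a mapping from each selected school to at most 10 sampled student ids. Larger schools are more likely to be selected, while capping the within-school sample keeps the fieldwork cost per school roughly constant.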

What Does Assessment Data Tell?
All these large-scale assessments produce very important information about the educational system. The large-scale assessment data have been reporting a) consistently low student scores, b) the dismal physical condition of schools, c) a scarcity of basic facilities, d) the low capacity of teachers and e) the poor socio-economic condition of students. The data show consistently low scores in core subjects. Although comparing grade IV and grade III data is not strictly justified, it can be seen that students' language scores were lower in 2017. When the reading and writing scores are split, it becomes clear that students have very weak writing skills. For example, in 2017 the writing scores in Sindhi, Urdu and English were 30.46%, 26.74% and 21.03% respectively, while students' scores in the reading portion of these subjects were 58.18%, 26.74% and 21.03%. The large difference in English (40.30% in 2015 and 25.03% in 2022) is alarming. In 2022, students' scores in Sindhi Writing (21.05%) and Urdu Writing (25.84%) also need consideration. The results of 2022 may also be seen from the angle of COVID-19 learning losses. Likewise, SAT has been reporting dismal student achievement. In Science and Math, a gradual improvement is evident in both grades. Language scores seem consistent. However, scores ranging between 20 and 35 are unacceptable by any standard. The performance of the youth of Sindh in ASER is a great concern: of the 12 instances above, Sindh performed better than FATA and Baluchistan in only three. In nine instances, Sindh was at the bottom. Some gradual progress may be witnessed in Reading Story in Mother Tongue. However, in English and Math the situation has not changed much. At the national level things are gradually getting better, but a drastic change may not be expected. Pakistan also participated in TIMSS in 2019. Its performance in TIMSS was dismal: Pakistan stood second from the bottom.
Only 27% of 4th grade students in the country could meet the low international benchmark in mathematics, eight percent could reach the intermediate international benchmark, and just one percent could reach the high international benchmark (Halai, 2021).

Reasons of Low Score
Following are a few reasons identified by the experts who have been part of large-scale assessment. One of the biggest reasons for low scores is the difference between large-scale assessment and the assessment practices generally enacted in classrooms. 'These large-scale assessments are developed on the concepts presented in the curriculum and cover all the learning areas described in the curriculum' (Participant 3, September 2, 2021). On the other hand, in our context the assessment system in classrooms focuses on measuring knowledge only. 'Students' learning is tested by rote learning and memorization process. Questions generally demand students to reproduce their knowledge. An examination of three hours in whole year is considered as assessment' (Participant 2, September 4, 2021). Classroom assessment is based on knowledge retention and asks students to reproduce exactly what is written in their notebooks (Shazadiy & Rafa, 2018). This difference in assessment practices may also be associated with teachers' preparedness. Alif Ailan (2015) claims that 58 per cent of government school teachers have no knowledge of the national curriculum. The situation appears even more fragile given that 73% of teachers have never attended any course on assessment techniques during their pre-service training. Another big issue is course coverage, which is associated with the Scheme of Studies and education planning. The large-scale assessments are developed from the whole curriculum, in proportion to the weight allocated to each learning area. 'However, mostly teachers are unaware about the learning areas and they teach with the help of the textbook in a linear manner' (Participant 1, September 7, 2021). Consequently, the concepts at the end of textbooks are most likely not covered (Provincial Education Assessment Center Sindh, 2017). Availability of resources is a huge challenge, and it has many aspects.
Other than basic facilities, the availability of teachers is a serious issue. In Sindh, 49% of schools are single-teacher schools, where multi-grade teaching is the only way of teaching (Sindh Education Sector Plan, 2014). Multi-grade teaching is different from single-grade teaching, and there is no proper training for teachers to handle it (Cheema, 2017). The unavailability of physical resources is also an issue. A large number of schools lack shelter, drinking water, functional toilets, electricity and boundary walls. In December 2021, the education department shut down 4,901 schools in different districts of the province which had no proper buildings, no enrollment and no teachers (Sindh School Education and Literacy Department, 2021). This dismal state demands that the issues be sorted out with a comprehensive but immediate plan. However, it also raises questions about conducting large-scale assessments whose results are of limited use.

What is the Need of Large Scale Assessment?
The use of large-scale assessment data has always been in question. Trends show that not much is changing in Sindh. In some areas, scores are even falling. Interestingly, during the last decade a few comprehensive interventions from donor agencies have taken place in Sindh. Strengthening Teacher Education in Pakistan by the Institute for Educational Development, Aga Khan University, and the Pakistan Reading Project and Sindh Reading Program by the United States Agency for International Development are a few examples (Pardhan, 2017; Hussain, Jamaludin & Mehmood, 2019). However, the consistently low scores suggest that the interventions to improve the educational system are either not effective or at least not sustained. In such a scenario, it is important to reconsider the rationale of large-scale assessment and find the reasons for low scores instead of repeating the large-scale assessment activities and, with a huge amount of finance and effort, producing the same results. Alternatively, a model of formative assessment may be devised, disseminated and practiced. The decayed assessment system, not only in public but also in private schools, has by and large dented the skills and abilities of students. The effects can be witnessed in competitive exams, entry tests and recruitment tests. In the recently conducted recruitment tests for the appointment of Primary School Teachers (PST) and Junior Elementary School Teachers (JEST), a large number of candidates could not get passing marks. Only 6.3% of candidates passed the PST test and 0.84% the JEST test. Therefore, in October 2021, the provincial cabinet made certain changes and decided to remove the important condition of obtaining 45 percent marks in each subject in order to be selected. It was also decided to lower the passing marks to 50%.
Candidates from minorities, girls from hard areas and differently abled persons were allowed to qualify with 33% marks. In another instance, in December 2021, the Sindh cabinet lowered the passing marks for admission to medical universities and colleges from 65% to 50%. The cabinet had serious reservations about the testing procedure set by the Pakistan Medical Commission. These instances show that, instead of making efforts to overhaul the assessment system, youth are given leniency, which may backfire and leave us with less competent professionals in the most sensitive fields, such as education and health.

Assessment after Learning Losses of COVID-19
In 2020, the COVID-19 pandemic hit education systems in more than 200 countries. In Sindh, schools remained closed for most of the academic year 2020-21. A smart syllabus was developed, and the SELD Learning App by Muse and Digital Classroom by Microsoft were launched to help children continue learning online. However, due to the digital divide, a large number of students could not use the online resources because digital devices and internet access were unavailable to them. In 2020, summative examinations were not conducted and students were promoted to the next grade. In 2021, the assessment was conducted on a condensed (reduced) syllabus. 'In primary schools, no proper examinations were conducted. Students were again promoted to the next class with very little evidence' (Participant 4, September 9, 2021). Assessment is likely to have been compromised during online learning. It may therefore be assumed that many students did not experience formative assessment and may face difficulties in assessment after COVID-19.

Conclusion and Recommendations
A huge amount of effort and finance is put into overhauling the quality of education in public schools through large-scale assessment. Besides PEACe, other actors such as IBA Sukkur University have also conducted large-scale assessments. However, there is a need to rethink the status, need and effectiveness of large-scale assessments. The organizations conducting large-scale assessment should have close and effective coordination with stakeholders in order to implement the recommendations presented to policy makers for improving the education system. Furthermore, it is the need of the day that, instead of conducting large-scale assessment, efforts be made to engage teachers in aligning their classroom practices with the standards and benchmarks that are assessed. Without bringing teachers on board, the intended outcomes of the assessments cannot be achieved.
The analysis of large-scale assessment practices, results and their implications demands that assessment practices at the classroom level be reconceptualized. Instead of conducting large-scale assessment again and again, teachers may be oriented to formative assessment. Their skills in developing higher-order questions and constructing a balanced test must be polished. This idea is not new: it has also been recognized by the Government of Sindh, which developed two very important documents in 2015, first the Policy on Sindh Assessment and Examinations (PSAE) and second the Sindh Education Student Learning Outcome Framework (SESLOAF). PSAE lists the policy actions to improve assessment practices, and SESLOAF provides detailed guidelines for conducting formative and summative assessments in the classroom. However, ground realities suggest that these policies have not been implemented properly. The main stakeholders are the teachers, and their lack of awareness of the policies shows the gap between policy and practice.
Assessment is one of the standards in the National Professional Standards for Teachers (2010), and it is equally important for prospective and in-service teachers. Assessment is part of teacher education, but this area needs to be strengthened so that prospective teachers are better prepared to assess students. In-service teachers, on the other hand, should be given proper training in assessing students, not only summatively but also formatively. In the recently developed Continuous Professional Development (CPD) Model, SELD has decided to include assessment as a core component. Through the CPD Model, teachers will be given regular on-site training by Guide Teachers and Subject Coordinators. This is very encouraging. However, in the current era of media and technology, teachers can also be oriented to assessment through videos and webinars, which is a cost-effective, remote way to reach more teachers.