CAREER SELECTION OF STUDENTS USING HYBRIDIZED DISTANCE MEASURE BASED ON PICTURE FUZZY SET AND ROUGH SET THEORY

Original scientific paper Abstract: Because the future of society depends on its students, methods for selecting a suitable career for a student are an important problem to explore. It is assumed that a student who has the required capability and a positive attitude towards a subject will achieve more in that subject. To handle the uncertainty involved in students' career selection, this study proposes approaches based on picture fuzzy sets (PFS) and rough sets, which are well suited to incomplete and imprecise information. For the purpose of selecting a suitable career, the article analyzes a student's features in terms of career, memory, interest, knowledge, environment and attitude. We propose two hybridized distance measures using Hausdorff, Hamming and Euclidean distances in a picture fuzzy environment, where the evaluating information regarding students, subjects and students' features is given as picture fuzzy numbers. We then present an algorithmic approach using the proposed distance measures and rough set theory. Rough set theory is applied to determine whether a particular subject is suitable for a student even when the choice of stream is contested. The lower and upper approximations and the boundary region of rough set theory are used to manage inconsistent situations. Finally, two case studies are presented to demonstrate the applicability of the proposed idea.


Introduction
Selecting a subject for a student's career is a vital task, since it bears on future employment, which influences the student's whole life and ultimately contributes to social development. It is a delicate decision for students, as they are ambitious about it (Batool et al., 2020, Van Dinh et al., 2019, Pratiwi et al., 2020, McKenzie et al., 2020, Wang et al., 2020, Babajide et al., 2020, Orewere et al., 2020). Some students are confused or careless when choosing a stream. The main source of confusion is whether the stream suits their future prospects and whether their abilities are sufficient to study it efficiently and sincerely. Many students lack confidence because they do not clearly know what they will study or what a particular subject or stream contains. As reported in the literature on different colleges and streams (Wen et al., 2018, Nehmeh et al., 2018), students often find it difficult to choose a proper stream and make it their career. Many authors have contributed to the students' career selection problem over the last few decades, and some of the significant contributions are summarized below. In (Wen et al., 2018), the authors discussed the career choice issue of choosing accountancy as a career, which motivated researchers to look for methods of choosing a career for a student by selecting an appropriate subject. A survey of 216 students showed that both internal and external factors influence the career selection process (Babajide et al., 2020). A survey of three hundred students showed that parents' economic status or social class, financial support, decision making and learning abilities have a large impact on students' career decisions (Batool et al., 2020).
Research has also revealed that, in order to fulfil parents' expectations and for family or cultural values, a few students choose a career in medical school in spite of lower academic performance (Griffin et al., 2019), which later hampers their career. Students' engagement, family encouragement, family capital and various scientific matters also encourage students to study science (Silseth et al., 2018) and select their career accordingly. Although scientific beliefs, teachers' and parents' expectations, a sense of encouragement, and academic prediction motivate students to choose science as their career, there is an explicit gap between the motivation factors and the career selection, as illustrated in (Ramentol et al., 2019). In rural areas, environmental factors such as family poverty and rurality often influence students' career choices (Carrico et al., 2019). The study in (Holloway-Friesen et al., 2018) found that academic persistence, pursuit of career goals, and high career expectations are significantly influenced by the college environment. It is also observed that the environment, in terms of more guidance and counselling centers, influences students to choose the right career (Orewere et al., 2020). Hence it can be concluded that environmental factors such as parents, teachers, family, place, and institute play a major role in choosing a career. Other factors matter alongside the environmental ones. A study of 502 students showed that factors such as self-efficacy, outcome expectation and career intention have a greater impact on career choice (Pratiwi et al., 2020, McKenzie et al., 2020). A survey of students reported in (Madden et al., 2018) indicated that the hardness or softness of a subject does not matter to a student who is interested in that subject.
An interest in the chosen profession influences performance in service and activities (Alkaya et al., 2018). Goel et al. (2018) found that the decision to join the medical profession depends mainly on scientific (interest in medicine), social (respect/prestige) and humanitarian (desire to help others) factors. Hannula et al. (2002) observed four ways to evaluate the attitude of a student: emotions aroused in the situation, emotions associated with the stimuli, expected consequences, and the relation of the situation to personal values. A positive attitude towards a subject plays an important role, since it influences the expected achievement (Guido et al., 2018, Burns et al., 2018). Based on the above discussion, it can be concluded that extrinsic motivation, intrinsic interest, and perceived support and encouragement for a particular subject strongly inspire students to study that subject. Hence, by analyzing the related concerns, we find that the specific characteristics of a student may be the cause of success in a stream. The studies mentioned above suggest that attitude, knowledge, interest, career, memory and environment may be considered the key factors for students' success. Two computational intelligence techniques, picture fuzzy set (PFS) and rough set (RS) theory, play a significant role in predicting careers for students and in other decision-making processes (Kumbhar et al., 2020, Si et al., 2020, Das et al., 2018, De et al., 2019). In (Dutta, 2018), PFS is applied to medical investigation and diagnosis. PFS is an extension of the intuitionistic fuzzy set (IFS) that deals with uncertainty in situations involving answers of the types yes, abstain and no (Cuong et al., 2013), whereas a rough set is a pair of crisp sets, i.e., the lower approximation and the upper approximation of the original set, and these sets may contain fuzzy values (Pawlak, 1995).
A detailed study of picture fuzzy sets is found in (Cuong, 2014). Picture fuzzy clustering algorithms can be developed for exploiting and investigating hidden knowledge in data. Hierarchical picture clustering, proposed in (Son, 2016), integrates a generalized picture distance measure with hierarchical picture fuzzy clustering. PFS are also useful for computational intelligence problems (Cuong et al., 2013). On the other hand, a probability-based rough set theory is proposed in (Ramentol et al., 2019) to predict how likely a student is to succeed in the academic year. An algorithm for decision making based on rough fuzzy sets with clustering and upper-lower approximation is modelled in (Ramentol et al., 2019). The intuitionistic fuzzy rough set (Tan et al., 2018) is a combination of intuitionistic fuzzy (IF) and RS theory, where IF relations are defined and characterized by the lower and upper approximations of the rough set. Measures are then developed to evaluate approximation quality and classification ability. The notion of a picture fuzzy rough soft set, which combines PFS and RS, has been introduced and used for classification, decision making, and knowledge discovery. Picture fuzzy rough soft sets and picture fuzzy dynamic systems have also been introduced as extensions of PFS, together with their applications. Van et al. (2019) defined distance measures between PFS, with similarity and dissimilarity measures that are useful for image segmentation, decision making and pattern recognition.
To the best of our knowledge, no prior work has addressed student career selection through the hybridization of distance measures using PFS and RS. Choosing a stream as a career is a research area of applied fuzzy set theory. After acquiring basic knowledge of different subjects in school, the student must decide on a lifelong career by choosing a stream. Because the information regarding different streams is imprecise and incomplete, the concept of a fuzzy set is indispensable for decision making. In (Dutta, 2018), the author discussed the application of PFS in medical diagnosis, but to our knowledge very few researchers have applied fuzzy set theory to career selection, although choosing a stream is a vital and critical task (Wen et al., 2018). Our proposed model, PFS with hybrid distance measures, is intended to find the subject with the minimum distance from the student and thereby help in choosing the right stream. Some students have positive potential towards a particular subject, for example a high caliber in computer science, so the degree of positive potential is to be taken into consideration when choosing a stream as a career. Some students have little or negative potential towards a particular subject, for example a student with a high caliber in literature or art may have little or negative knowledge of mathematics. Hence, the degree of negative potential towards the stream is also to be considered. It is further observed that some calibers may have a neutral effect on acquiring a subject, i.e., a degree of neutrality, neither negative nor positive, can be considered for subjects like physics, mathematics, or computer science. Therefore, it is convenient to consider the degree of neutrality in career selection.
In this article, we present a PFS-based approach which uses hybridized distance measures for choosing a suitable career for a student, where the rough set is used to resolve any confusion in choosing a stream. When students complete school education, they need to choose a particular stream as their career or to fulfil their ambitions. Finding out the student's ability and then assigning a suitable and appropriate stream is a major challenge. To resolve the issue, we have quantified two qualitative concepts, i.e., what a subject requires of a student and the student's abilities towards the subject. Hence, we have considered the subjects with their features concerning the student's interest, knowledge, memory, career, attitude and environment impact as the general requirement. We have also considered a student's features as interest, knowledge, memory, career, attitude and environment impact towards the specific stream. With these two quantities, we compute the distance between the student and the subjects using the proposed hybridized distance measures and then select as the career the subject which has the minimum distance from the student. In the process, we may face the situations listed below.
a) Eligible for more than one stream.
b) Neutral for more than one stream.
c) Perfect for one subject.
d) Not eligible for the subjects.
Case (c) poses no problem, but for cases (a), (b) and (d) the decision is critical. These cases can be resolved using Rough Set (RS) theory when choosing a stream. For example, if computer science, mathematics and physics all have the same, minimum distance, there will be fuzziness or inconsistency. Again, to choose computer science with a specialty in data science as a career, the student must have caliber in computer science, mathematics and statistics, and in this case fuzziness or inconsistency may arise, which is resolved using rough set theory.
We have proposed two hybridized distance measures, namely the hybridization of Hausdorff and Hamming distance measures and the hybridization of Hausdorff and Euclidean distance measures, for measuring the distance between a student's features and a stream's features. The student's features are summarized by the degree of positive potential, the degree of neutral potential and the degree of negative potential towards the subject, whereas the subject's features are identified with interest, subject knowledge, memory, career, attitude and environmental impact as the potential's requirement. The refusal degree, in the sense of being neither positive, negative nor neutral, is also taken into consideration. Likewise, for interest, subject knowledge, memory, career, attitude and environment impact, the degree of positive potential, the degree of neutral potential, the degree of negative potential and the refusal degree are considered. Thus, for each student, we have a four-dimensional vector of information towards a particular subject, and for each subject, we have a set of four-dimensional vectors where the set contains six elements. We have illustrated the proposed methods using two case studies. The workflow diagram of our proposed model is depicted in Figure 1.
[Figure 1. Workflow of the proposed model: the subjects' requirements concerning a student's characteristics and the student's characteristics towards a subject feed the distance measurement between the two. If multiple subjects must be considered jointly to select a subject (YES), rough set theory with the picture fuzzy model is used to decide on the subject; otherwise (NO), the subject with the minimum distance measure is chosen for the student.]

The rest of this paper is structured as follows. Section 2 notes the basic concepts used in the proposed models. Section 3 presents the proposed model, followed in Section 4 by its application to deciding a stream as the career of a student, where two case studies are illustrated for choosing the stream. The first case study is implemented using PFS and the second using both PFS and RS. A comparative study is given in Section 5. Finally, Section 6 gives our conclusion with possible future work.

Preliminaries
In this section, we discuss some basic concepts related to this paper.

Picture Fuzzy Set (PFS)
A PFS A on X (Cuong, 2014) is an object of the form $A = \{(x, \mu_A(x), \eta_A(x), \nu_A(x)) : x \in X\}$. Here $\mu_A(x) \in [0, 1]$ is the degree of positive membership of x in A, $\eta_A(x) \in [0, 1]$ is the degree of neutral membership of x in A, and $\nu_A(x) \in [0, 1]$ is the degree of negative membership of x in A, with $\mu_A(x) + \eta_A(x) + \nu_A(x) \le 1$, and $\rho_A(x) = 1 - (\mu_A(x) + \eta_A(x) + \nu_A(x))$ is the refusal degree of x in A.
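As an illustration of the definition, the following minimal Python sketch stores one picture fuzzy value and derives its refusal degree. The class name `PictureFuzzyNumber` and its field names are hypothetical and are not taken from the paper; the validity checks assume the usual constraint that the three degrees lie in [0, 1] and sum to at most 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PictureFuzzyNumber:
    """Hypothetical container for one picture fuzzy value."""
    positive: float  # degree of positive membership
    neutral: float   # degree of neutral membership
    negative: float  # degree of negative membership

    def __post_init__(self):
        degrees = (self.positive, self.neutral, self.negative)
        if any(d < 0.0 or d > 1.0 for d in degrees):
            raise ValueError("each degree must lie in [0, 1]")
        if sum(degrees) > 1.0 + 1e-9:
            raise ValueError("the three degrees must sum to at most 1")

    @property
    def refusal(self) -> float:
        """Refusal degree: 1 - (positive + neutral + negative)."""
        return 1.0 - (self.positive + self.neutral + self.negative)

    def as_vector(self):
        """Four-dimensional vector (positive, neutral, negative, refusal)."""
        return (self.positive, self.neutral, self.negative, self.refusal)
```

With illustrative values, `PictureFuzzyNumber(0.5, 0.2, 0.1)` has a refusal degree of about 0.2, and `as_vector()` gives the four-dimensional vector used when distances are computed.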

Euclidean Distance
The Euclidean distance between two real-valued vectors a and b is the square root of the sum of the squared differences between them, $d_E(a, b) = \sqrt{\sum_i (a_i - b_i)^2}$.

Hausdorff Distance
The Hausdorff distance from a set P to a set Q is $h(P, Q) = \max_{a \in P} \min_{b \in Q} d(a, b)$, where a and b are points of sets P and Q respectively and d(a, b) is any distance metric between the points of P and Q.

Rough Set
(Pawlak, 1995) Let (U, A) be the Information System (IS), where U is a set of objects and A is a finite set of attributes such that, ∀ α ∈ A, α: U → Vα, where Vα is the value set of α and U ≠ Ø, A ≠ Ø. T = (U, A ∪ {γ}) is the decision system, where the attributes contained in A are condition attributes and γ is the decision attribute. RS theory deals with imperfect knowledge, which is expressed by the boundary region of a set and defined with the topological operations of interior and closure, called approximations.

The indiscernible relation
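R ⊆ X × X is a binary relation satisfying the reflexive, symmetric and transitive properties. For x ∈ X, the equivalence class is $[x]_R = \{y \in X \mid x R y\}$. Let IS = (U, A) be an information system; for B ⊆ A, the associated equivalence relation is defined as $IND(B) = \{(x, y) \in U \times U \mid \forall \alpha \in B,\ \alpha(x) = \alpha(y)\}$. For X ⊆ U:
a. The B-lower and B-upper approximations of X are $\underline{B}X = \{x \mid [x]_B \subseteq X\}$ and $\overline{B}X = \{x \mid [x]_B \cap X \neq \emptyset\}$ respectively.
b. The B-boundary region, $BN_B(X) = \overline{B}X - \underline{B}X$, consists of the objects which cannot be decisively classified into X using B.
c. The B-outside region, $U - \overline{B}X$, consists of the objects which do not belong to X.
d. If the boundary region is non-empty, then X is a rough set.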
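The approximations and regions of a rough set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the dictionary-based information table are hypothetical, and objects are treated as indiscernible when they share the same values on the chosen attributes.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group objects that share the same values on the chosen attributes."""
    groups = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        groups[key].add(obj)
    return list(groups.values())


def approximations(table, attributes, target):
    """Return (lower, upper, boundary, outside) of target w.r.t. attributes."""
    target = set(target)
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attributes):
        if cls <= target:   # class lies entirely inside the target set
            lower |= cls
        if cls & target:    # class overlaps the target set
            upper |= cls
    boundary = upper - lower
    outside = set(table) - upper
    return lower, upper, boundary, outside


# Hypothetical information table with four objects and two attributes.
table = {
    "u1": {"a": 1, "b": 0},
    "u2": {"a": 1, "b": 0},
    "u3": {"a": 0, "b": 1},
    "u4": {"a": 0, "b": 0},
}
lower, upper, boundary, outside = approximations(table, ["a", "b"], {"u1", "u3"})
```

In this toy table, u1 and u2 are indiscernible but only u1 is in the target set, so both fall in the boundary region and the target set is rough.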

Rough membership function
The rough membership function quantifies the degree of relative overlap between the set X and the equivalence class R(x) to which x belongs. It is defined as $\mu_X^R : U \to [0, 1]$, $\mu_X^R(x) = \frac{|X \cap R(x)|}{|R(x)|}$, where |X| denotes the cardinality of X. The membership function of the rough set can be interpreted as:
(i) the conditional probability that x belongs to X given R;
(ii) a degree to which x belongs to X given the information about x expressed by R.
Using the rough membership function, the approximations and the boundary region of a set are as follows: $\underline{R}X = \{x \in U : \mu_X^R(x) = 1\}$, $\overline{R}X = \{x \in U : \mu_X^R(x) > 0\}$ and $BN_R(X) = \{x \in U : 0 < \mu_X^R(x) < 1\}$.
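A minimal Python sketch of the rough membership function is given below. The function names are hypothetical, and the equivalence class of x is assumed to be supplied by the caller; the second helper reads off the region from the membership value in the standard way (1 for the lower approximation, strictly between 0 and 1 for the boundary, 0 for the outside region).

```python
def rough_membership(x, target, equivalence_class):
    """Share of the equivalence class of x that lies inside the target set."""
    cls = set(equivalence_class)
    if x not in cls:
        raise ValueError("x must belong to its own equivalence class")
    return len(cls & set(target)) / len(cls)


def region(membership):
    """Map a rough membership value to the region it indicates."""
    if membership == 1.0:
        return "lower"
    if membership > 0.0:
        return "boundary"
    return "outside"
```

For a target set {u1, u3} and an equivalence class {u1, u2}, the membership of u1 is 0.5, which places it in the boundary region.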

Dealing with inconsistency situations
If inconsistent situations arise, they may be resolved using one of the following actions.
a) Consult an expert on the action to take.
b) Make separate tables for the conflicting situations.
c) Remove the examples that have less support.
d) Use the quality method, based on the upper and lower approximation sets, to resolve the inconsistency.
e) Generate new decision attributes.

Proposed method
This section presents the proposed method. First, we present two hybridized distance measures, namely the hybridization of Hausdorff and Hamming distances and the hybridization of Hausdorff and Euclidean distances. Then we present an algorithmic approach using these two distance measures.

The hybridization of Hausdorff and Hamming distance measure
A distance measure is an objective score that summarizes the relative difference between two objects in a problem domain. The Hausdorff distance is the greatest of all distances from a point in one set to the closest point in the other set. The Hamming distance calculates the sum or the average of the differences between the two values. To calculate the distance from a vector to a set of vectors, we derive a set of distance values using the Hamming distance and obtain the final distance from them using the Hausdorff distance. This gives the hybridized Hausdorff and Hamming distance from x ∈ X to the set Y.

The hybridization of Hausdorff and Euclidean distance measure
The Euclidean distance calculates the distance between two real-valued vectors as the square root of the sum of the squared differences between the two vectors. Using the Euclidean distance from a vector to a set of vectors, we obtain a set of distance values, and the final distance is obtained from them using the Hausdorff distance. Thus, for x ∈ X, the hybridized distance $d(x, Y)$ is built from the squared differences $(x - y)^2$ and their maximum $\max((x - y)^2)$ over $y \in Y$, where X and Y are sets of vectors.
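The two-stage idea, pointwise distances first and a Hausdorff-style aggregation second, can be sketched in Python as follows. This is one plausible reading and not the paper's exact formula: the averaged Hamming distance and the use of the maximum over the set as the aggregation step are assumptions, and all function names and numeric values are hypothetical.

```python
import math


def hamming(u, v):
    """Average absolute difference between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)


def euclidean(u, v):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hybrid_distance(x, ys, pointwise):
    """Assumed aggregation: largest pointwise distance from x to any y in ys."""
    return max(pointwise(x, y) for y in ys)


# Illustrative vectors (positive, neutral, negative, refusal), not from the paper.
student = (0.5, 0.2, 0.1, 0.2)
subject_features = [(0.5, 0.2, 0.1, 0.2), (0.7, 0.1, 0.1, 0.1)]

d_hh = hybrid_distance(student, subject_features, hamming)
d_he = hybrid_distance(student, subject_features, euclidean)
```

Passing the pointwise distance as a parameter keeps the aggregation step identical for both hybrid measures, which mirrors how the two measures are described in the text.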

Algorithmic approach
Step 1. The degrees of measurement in terms of PFS are noted for each subject. For each subject, the characteristics are the requirements in interest, subject knowledge, memory, career, attitude and environment impact. Each characteristic value is quantified on four grounds, i.e., the degree of positive potential, the degree of neutral potential, the degree of negative potential and the refusal degree. The refusal degree for the stream x, denoted $\rho(x)$, is as follows.

$\rho(x) = 1 - (\mu(x) + \eta(x) + \nu(x))$

Step 2. The student's potentials towards the subjects are noted in terms of a PFS B,

$B = \{(s, \mu_B(s), \eta_B(s), \nu_B(s)) : s \in S\}$

where S is the set of students and $\mu_B(s)$, $\eta_B(s)$, $\nu_B(s)$ represent the degree of positive potential, the degree of neutral potential, and the degree of negative potential towards the subject respectively.
Step 3. The hybridized distance measures, i.e., the hybridization of Hausdorff and Hamming distance measures and the hybridization of Hausdorff and Euclidean distance measures, are calculated between student and subject. The subject features in terms of PFS are defined in step 1 and the student's caliber towards the subject in terms of PFS is defined in step 2.
Step 4. Step 3 is repeated for all the subjects. Then the subject having the minimum distance is noted.
Step 5. If the subject's distance from the student is sufficient to decide the career, then the subject having the minimum distance is chosen for the career. Otherwise, we follow the next step.
Step 6. If multiple subjects have the minimum distance, or more than one subject must be considered for choosing a career, we use RS theory to resolve the fuzziness or inconsistency.
Step 6.1. All subjects (say set S) are approximated by constructing the B-lower and B-upper approximations of S according to the distance measures set B (the information given by the distance measures from student to subjects), which are respectively stated as $\underline{B}S = \{x \mid [x]_B \subseteq S\}$ and $\overline{B}S = \{x \mid [x]_B \cap S \neq \emptyset\}$.
Step 6.2. The B-boundary region of S, $BN_B(S) = \overline{B}S - \underline{B}S$, consists of the subjects which cannot be decisively classified into S using B.
Step 6.3. The B-outside region, $U - \overline{B}S$, consists of the subjects which are not suitable for a student.
Step 6.4. RS theory is used when the boundary region is non-empty: the membership function values of the subjects are computed using probability, and the subjects with high membership function values are selected.
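The decision between picking the minimum-distance subject and falling back to rough set theory can be sketched as below. The function name, the tolerance used to detect ties, and the distance values are hypothetical; the sketch covers only the case of several subjects sharing the minimum distance.

```python
def select_subject(distances, tolerance=1e-9):
    """Return the minimum-distance subject(s); more than one means a tie."""
    best = min(distances.values())
    return sorted(s for s, d in distances.items() if d - best <= tolerance)


# Illustrative distances from one student to three subjects.
unique = select_subject({"computer science": 0.23, "physics": 0.31, "chemistry": 0.27})
tied = select_subject({"computer science": 0.23, "physics": 0.31, "chemistry": 0.23})

# A single candidate is chosen directly; several candidates would be
# passed on to the rough set step for resolution.
needs_rough_set = len(tied) > 1
```

A returned list with one entry corresponds to the direct choice of the minimum-distance subject, while a longer list signals that the rough set step is required.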

Case study
We present two case studies. In case study 1, we compute the distances between students and subjects and then find the minimum distance for choosing a stream. In case study 2, we illustrate an inconsistent situation in choosing a stream, and rough set theory is applied to resolve it.
Case study 1: This case study is based on the students' distance measures from a stream for selecting the career using two different approaches, i.e., the hybridization of Hausdorff and Hamming distance measures and the hybridization of Hausdorff and Euclidean distance measures. The steps to be followed are as follows.
Step 1. Find the required calibers for the subjects.
Step 2. Find the calibers of the student towards different subjects.
Step 3. Measure the distance between different subjects from the student using the approaches Hybridization of Hausdorff & Hamming distance measures and Hybridization of Hausdorff & Euclidean distance measures.
Step 4. A suitable stream is selected for the student having minimum distance measure.
In this case study, we have taken five subjects, namely computer science ($x_1$), physics ($x_2$), chemistry ($x_3$), mathematics ($x_4$) and statistics ($x_5$), with their features interest, subject knowledge, memory, career, attitude and environment impact as the potential requirement towards the respective subjects, as summarized in the corresponding table. Then, we have considered four students $s_1, s_2, s_3, s_4$ and their potentials towards the subjects $x_1, x_2, x_3, x_4, x_5$ in terms of a PFS B as follows: $B = \{(s, \mu_B(s), \eta_B(s), \nu_B(s)) : s \in S\}$, where S is the set of students and $\mu_B(s)$, $\eta_B(s)$, $\nu_B(s)$ represent the degree of positive potential, the degree of neutral potential, and the degree of negative potential towards the stream respectively.
For student $s_1 \in S$, let us consider stream $x_1 \in X$, where $x_1$ stands for computer science. For student $s_1$ and stream $x_1$, the Hausdorff and Hamming distance is 0.2278703839.
Similarly, the results for the other students and subjects are summarized in table 3. From table 3 and table 4, it is found that students $s_1$, $s_2$, $s_3$ are best suited to chemistry ($x_3$), whereas student $s_4$ is best suited to physics ($x_2$), since they have the minimum distance in the respective subjects, as highlighted in boldface.
Case study 2: This case study extends the approach of case study 1. We have used PFS with the hybrid distance measures as well as RS theory to decide whether a stream is suitable for a student or not. PFS and the hybrid distance measures are used to find the distance between a student's potential and the potential required for a stream. RS theory is then applied, using the distance measures, to choose the best stream from the different options and criteria and to finalize the suitable stream. The steps followed are summarized below.
Step 1. Note the required efficiency for the subjects.
Step 2. Collect the data of the students according to their efficiency towards the subjects.
Step 3. Measure the distance between different subjects from the student using the approaches Hybridization of Hausdorff & Hamming distance measures and Hybridization of Hausdorff & Euclidean distance measures.
Step 4. Use RS theory to categorize the students according to their distance measures and to identify the students who are in an inconsistent (fuzzy) situation.
Step 5. Resolve the fuzziness using rule (c) of the section on dealing with inconsistency situations.
Step 6. Finalize the stream for the students.
This case study examines the efficiency of 10 students towards computer science, say $x_1$. Choosing computer science as a stream depends not only on efficiency in computer science ($x_1$) but also on efficiency in mathematics, say $x_4$, and statistics, say $x_5$. The requirements of efficiency for $x_1$, $x_4$ and $x_5$ are those defined in the previous case study. Here $U = \{s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8, s_9, s_{10}\}$ is the set of students, $A = \{a_1, a_2, a_3\}$ is the set of attributes corresponding to the subjects computer science, mathematics and statistics respectively, and Y stands for computer science as the decision variable. For α ∈ A, define $\alpha(s_i)$, i = 1 to 10, as the distance measure. Our decision attribute y = computer science has the criteria that for the attribute of $x_1$, $V_\alpha = t_1$, for the attribute of $x_4$, $V_\alpha = t_2$, and for the attribute of $x_5$, $V_\alpha = t_3$, where $t_1$, $t_2$ and $t_3$ are the threshold values of the corresponding subjects. The relation is defined as R1: X → Y if $V_y \le t$, where $V_y$ is the distance measure from X and t is the threshold value; if $V_y$ is less than or equal to t, the student is interpreted as eligible for computer science.
First, for the hybridization of Hausdorff & Hamming distance measures, we take R1: X → Y with $V_y \le 0.3$ for computer science, $V_y \le 0.2$ for mathematics and $V_y \le 0.4$ for statistics, and classify the students accordingly. Under both distance measures, we have found that students $s_3, s_5, s_6, s_7, s_8, s_{10}$ are in inconsistent situations. The rules summarized in the section on dealing with inconsistency situations may be followed for these cases. By removing the cases with less support (rule c of that section) and following the membership function values for $s_3, s_5, s_6, s_7, s_8, s_{10}$, we conclude that $s_5, s_6, s_7$ are eligible for computer science whereas $s_3, s_8, s_{10}$ are not. Finally, we conclude that $s_1, s_2, s_4, s_5, s_6, s_7$ are eligible for computer science and may choose the stream, whereas $s_3, s_8, s_9, s_{10}$ are not.
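The threshold rule can be sketched in Python as follows. The thresholds 0.3, 0.2 and 0.4 are those stated for the Hausdorff & Hamming measure; everything else is an assumption made for illustration: the function name, the distance values, and in particular the simplified reading that a student is eligible when every threshold is met, not eligible when none is met, and inconsistent otherwise.

```python
# Thresholds stated in the text for the Hausdorff & Hamming measure.
THRESHOLDS = {"computer science": 0.3, "mathematics": 0.2, "statistics": 0.4}


def classify(distances, thresholds=THRESHOLDS):
    """Assumed rule: all thresholds met, none met, or a mixed outcome."""
    met = [distances[subject] <= limit for subject, limit in thresholds.items()]
    if all(met):
        return "eligible"
    if not any(met):
        return "not eligible"
    return "inconsistent"


# Hypothetical distances for three students (not taken from the paper).
students = {
    "A": {"computer science": 0.25, "mathematics": 0.15, "statistics": 0.35},
    "B": {"computer science": 0.25, "mathematics": 0.28, "statistics": 0.35},
    "C": {"computer science": 0.45, "mathematics": 0.32, "statistics": 0.50},
}
labels = {name: classify(d) for name, d in students.items()}
```

Students labelled inconsistent in such a scheme are the ones that would be passed to the rough membership step for a final decision.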

Comparative study
Fuzzy set-based approaches have also contributed to the students' career selection process (Natividad et al., 2019). In this paper, we have attempted to improve the process by incorporating more attributes regarding students' career selection, represented using PFNs. This greater number of attributes is needed to reach a more suitable decision than the fuzzy-based approach given in (Natividad et al., 2019). We have interpreted each attribute using four concepts, i.e., the degree of positive potential, the degree of neutral potential, the degree of negative potential and the refusal degree. In (Nguyen et al., 2018), the authors studied a fuzzy linguistic approach for multicriteria decision making by considering the interest of the student, but in practice other factors matter along with interest. The proposed approach also considers these other factors, such as subject knowledge, memory, career, attitude and environmental impact. In (Peker et al., 2017), the authors claimed that students' prior educational successes and teachers' views are jointly important for identifying students' professional interests and capacities. The authors proposed a web-based system, namely WEB-CGS, modelled using the Mamdani fuzzy model, in which students' interests are interpreted using the traditional methods of question-answering and evaluation by teachers, which may not always be accurate for selecting a career. In (Nie et al., 2018), the authors worked on students' information such as skill, regularity, economic status and subject interest, and trained machine learning techniques on that information for future forecasting, but the study does not interpret skill, regularity, economic status and interest individually, which is a major issue for accurate prediction.
Our proposed method analyses students' information in terms of attitude, knowledge, interest, career, memory and environment and expresses that information using PFS with the degree of positive potential, the degree of negative potential, the degree of neutrality and the refusal degree for both the students and the subjects. In addition, RS is applied in the proposed approach when confusion arises in choosing a stream, whereas none of the mentioned methods uses it in the same context.

Conclusion
Students are often confused and find it difficult to choose a stream as their career after basic schooling. PFS and RS based approaches are useful in selecting the stream that will be appropriate for a student. The hybridization of Hausdorff & Hamming distance measures and the hybridization of Hausdorff & Euclidean distance measures are proposed to find the distance between the student and the subjects with the attributes interest, subject knowledge, memory, career, attitude and environment impact. The subject having the minimum distance from the student is chosen as the suitable and appropriate one for the student. Rough set theory, with its lower approximation, upper approximation and boundary region, is proposed to handle the inconsistent situations that arise when a particular stream is taken into consideration. Thus, we have proposed two models for choosing a career, one for selecting a subject and another for selecting a subject in an inconsistent situation, and a case study is illustrated for each. Our proposed models take the subjects' attribute values with the degree of positive membership, the degree of neutral membership, and the degree of negative membership when considering PFS, and they consider the students' attribute values in the same way. Finding the values of the degree of positive membership, the degree of neutral membership and the degree of negative membership is itself a challenging job. Hence our future work will focus on building a model to generate the positive, neutral and negative degrees when considering PFS. It is also possible to extend the approach by adding more attributes that influence students in choosing streams.
Author Contributions: Each author has participated and contributed sufficiently to take public responsibility for appropriate portions of the content.
Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflicts of interest.