sql – PARTITION BY Name,Id用于比较和检测问题
|
交代 想象一下,这里有3家公司.我们按姓名加入表格,因为并非每位员工都提供了他的PersonalNo. StringId只有专家,所以它也不能用于加入.同一名员工可以在多家公司工作. 问题 问题是可能存在具有相同名称的不同员工(具有相同的名字和姓氏,在示例中仅提供名字). 我需要的? 当数据有问题时返回1,如果正确则返回0. 检测问题的规则 >当有多个相同的名字(2个或更多)并且所有人都具有相同的PersonalNo而不是所有人都有StringId(如彼得)应该返回1(这是错误的) 样本数据 Company Name PersonalNo StringId Comp1 Peter 3850342515 85426 ------------------------------------------------------------------- Comp2 Peter 3850342515 '' -- If have the same PersonalNo and there is no StringId - 1 (wrong) Comp1 John NULL 12345 ------------------------------------------------------------------ Comp2 John 3952525252 12345 -- If have the same StringId and 1 PersonalNo is NULL - 0 (correct) Comp1 Lisa 4951212581 52124 ---------------------------------------------------------------- Comp3 Lisa 4951212581 52124 -- If PersonalNo are equal and StringId are equal - 0 (correct) Comp1 Jennifer 4805250141 '' ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Comp1 Jennifer 4920225088 55443 -- If have 2 different PersonalNo and NULL PersonalNo,but where PersonalNo is NULL Comp3 Jennifer NULL 55443 -- Have the same StringId with other row where is provided PersonalNo it should be 0 (correct),with different PersonalNo where is no StringId shouldn't appear at all. Comp1 Ralph 3961212256 '' -- Shouldn't appear in select list,because only 1 row with this PersonalNo and there is no StringID 期望的输出 Peter 1 John 0 Lisa 0 Jennifer 0 QUERY LEFT JOIN (SELECT Name,(
SELECT CASE WHEN MIN(PersonalNo) <> MAX(d.PersonalNo)
and MIN(CASE WHEN StringId IS NULL THEN '0' ELSE StringId END) <> MAX(CASE WHEN d.StringId IS NULL THEN '0' ELSE d.StringId END) -- this is wrong
and MIN(PersonalNo) <> ''
and MIN(PersonalNo) IS NOT NULL
and MAX(rn) > 1 THEN 1
ELSE 0
END AS CheckPersonalNo
FROM (
SELECT Name,PersonalNo,[StringId],ROW_NUMBER() OVER (PARTITION BY Name,PersonalNo ORDER BY Name) rn
FROM TableEmp e1
WHERE Condition = 1 and e1.Name = d.Name
) sub2
GROUP BY Name
) CheckPersonalNo
FROM [TableEmp] d
WHERE Condition = 1
GROUP BY Name
) f ON f.Name = x.Name
查询的问题是我只能按名称分组,不能将PersonalNo添加到GROUP BY子句,所以我需要在选择列表中使用聚合.但是现在它仅比较MIN和MAX值,如果有超过2行具有相同的名称它没有按预期工作. 我需要做类似的事情,比较PARTITION BY Fullname,PersonalNo的值.它现在比较具有相同名称的值(不依赖于PersonalNo). 有任何想法吗?如果您有任何问题 – 请问我,我会尽力解释. 更新1 如果有2个条目具有不同的PersonalNo,但它们的StringId相等,则应为1(错误). Company Name PersonalNo StringId Comp1 Anna 4805250141 88552 -- different PersonalNo and the same StringId for both should go as 1 (wrong) Comp1 Anna 4920225088 88552 现在回来像: Anna 0 Anna 0 它应该是: Anna 1 更新2 在Identifier列中UNION更新后返回StringId:55443(对于下面的数据),但在这种情况下,当1个条目具有PersonalNo时,其他为空,但它们都具有相同(相等)StringId它是正确的(应为0) Comp1 Jennifer 4920225088 55443 Comp3 Jennifer '' 55443 解决方法我希望我理解你的要求..可能有其他方法可以做到这一点,但个人我可能会使用临时表进行临时工作,如果是我这样做.. --select data into a temp table that can be modified
select
*
into #cleaned
from
table
--apply personal numbers based on other records with matching string id
--you could take note of the records you are doing this to for data clean up
update c
set c.personalNo = s.personalNo
from #cleaned as c
inner join table as s
on c.name = s.name
and c.stringID = s.stringID
and c.personalNo is null
and s.personalNo is not null
--find all records with non matching string ids
select
name,count(*) as numIDs
into #issues
from(
select
name,stringID
from
#cleaned
group by
name,stringID
) as i
group by
name,PersonalNo
having
count(*) > 1
--select data for viewing.
select
distinct
s.name,case
when i.name is not null then 1
else 0
end as issue
from
#cleaned as s
left outer join #issues as i
on s.name = i.name
and s.personalNo = i.personalNo
order by issue desc
SQLFiddle:http://sqlfiddle.com/#!3/f4aab/7 抱歉,如果这里有虫子,但我相信你会得到这个想法,它不是火箭科学,只是另一种方法 编辑:刚刚注意到你对没有字符串ID的行感兴趣..只是如果它是唯一的行,那么它不是问题.我修改了第一个select(into #cleaned)以获取所有行. 编辑:没有临时表现在你知道它在做什么,这里是没有任何临时表的相同的东西 – 但警告这更新源表分配丢失的personalNo的 update c
set c.personalNo = s.personalNo
from table1 as c
inner join table1 as s
on c.name = s.name
and c.stringID = s.stringID
and c.personalNo is null
and s.personalNo is not null
select
distinct
s.name,case
when i.name is not null then 1
else 0
end as issue
from
table1 as s
left outer join (
select
name,count(*) as numIDs
from(
select
name,stringID
from
table1
group by
name,stringID
) as i
group by
name,PersonalNo
having
count(*) > 1
)
as i
on s.name = i.name
and s.personalNo = i.personalNo
order by issue desc
SQLFiddle:http://sqlfiddle.com/#!3/f4aab/8 分区我没看到我将如何在这里使用分区,因为你想要做的只是知道是否有多行,我使用更复杂的制表分区,或者如果我要对更新数据的判断调用的结果进行排名基于更复杂的规则..但无论如何这里是一个禁止分区的乌鸦:D Select
name,personalNo,case
when numstrings > 1 then 1
else 0 end as issue
from
(select
name,row_number() over (partition by
name,personalNo
order by
name,stringID
) as numstrings
from
#cleaned
group by
name,stringid) as d
order by
issue desc
注意:这使用了如上所述的#cleaned表,因为我认为这使得难以解决的问题的关键是有时候缺少的个人诺. 没有临时表,没有更新 (编辑:安卓应用网) 【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容! |
